INTRODUCTION

In this paper we will present a method of encoding geometric
line-drawings in a way which allows sets of such drawings to be
interpreted as formal languages, the purpose being to obtain a
characterization of certain geometric predicates in terms of their
properties as languages. Techniques usually associated with generative
grammars and formal automata can then be applied to this geometric
framework.

By way of background, it should be mentioned that the results
contained herein are an extension of work begun by William Rottmayer
and myself in the summer of 1969, under the direction of Professor
Patrick Suppes. Our effort was then informally described as a study
of geometric concept formation, as it might be applied to the learning
of such concepts by children. The formulation presented here is an
outgrowth of that summer's work. More extensive background material
appears in Rottmayer's report, "A Formal Theory of Perception."

Our investigation had been motivated primarily by a study of
perceptrons, which are formal machines capable of learning to recognise
certain classes of geometric figures. A perceptron accepts input in the
form of figures represented on a checkerboard-like grid, and through a
series of reinforced trials, its internal coefficients are modified to
converge to a state where recognition is perfect. The ability of a
perceptron to learn to recognise even the simplest of geometric predicates
is extremely limited, as demonstrated by Minsky and Papert in Perceptrons.
Another framework for the kind of learning which can be applied to
geometric concepts is suggested by Professor Suppes' paper
"Stimulus-Response Theory of Finite Automata." This paper indicates that
the stimulus-response theory of learning can be adequately applied to the
learning of any task which can be performed by a finite automaton; that
is, for any such task, the finite automaton which performs it represents
the convergent state of a stimulus-response learning model for that
task. In view of the possibility that this result could be extended
to include stronger forms of automata, in particular, various kinds of
Turing machines, our goal in this paper is to classify certain geometric
predicates on the basis of their recognition-potential by the various
classes of automata.

SECTION 1: GEOMETRIC FRAMEWORK



The first problem is to specify the kinds of geometric figures
which most naturally lend themselves to a simple learning situation.
It should be noted that, whereas the perceptron was intended to be a
general pattern recognition device with no particular emphasis on
geometric aspects, we have chosen to focus on those geometric aspects
themselves. Roughly speaking, the kind of geometry we want to consider
is that portion of Euclidean geometry consisting of all finite
straight-line drawings in the plane; that is, we take for our domain all
planar figures made up of a finite number of line segments, each of a
finite length. We will identify each predicate defined over a domain of
such figures with its extension as a set of line-drawings. Hence a figure
F can be said to satisfy the predicate "contains a triangle" if and only
if F ∈ {x | x is a figure containing a triangle}. We want to exclude
predicates which depend on the quantitative aspects of these drawings,
such as the exact measurement of area, length or angle, as well as those
predicates which are a function of the orientation of the drawings, and
so on. For example, we would like to include such predicates as "is a
triangle" and "contains a triangle" while excluding "is an equilateral
triangle" and "is a horizontal line." By committing ourselves to such
restrictions on our figure domain we isolate a class of figures which
is moderately rich in geometric properties and yet lends itself well to
linguistic treatment.

It seems natural then that the encoding of a figure should convey
information about the lines present and how they intersect one another.
Certainly the translation of a particular instance of a figure into such
a representation as a coded set of line segments will represent a
considerable abstraction. For example, any two triangles would be
coded identically, whether they be isosceles, equilateral, or whatever:
the coding would show three line segments, each pair of which has a
unique endpoint in common. Let us consider how an encoding might be
derived for the following figure:

Figure 1 

First assign to each vertex, that is, each endpoint of a line segment or
intersection of line segments, a unique Roman letter, as in Figure 2:



Figure 2 




Each line segment in the figure will be encoded as an ordered sequence
of those letters which correspond to the vertices on the line segment.
Thus ABC tells us that one line segment in the figure has endpoints
A and C and central vertex B. A complete coding for the figure is a
list of all the line segments that appear, e.g. ABC, BD, DEA, and FE.
It should be evident that there is such a coding for every figure,
whether the figure is in the form of a line drawing as given here or a
set of Cartesian coordinates in the plane. This coding is not unique;
the most obvious reason is that the vertex letters were assigned
arbitrarily. Furthermore, the line segments could have been listed in a
different order and the vertices within a line segment could have been
listed in reverse order. However, this lack of uniqueness will cause
no difficulty because there is a simple algorithm for determining whether
or not two codings apply to the same figure.

The most natural way to formalize this coding procedure is to
represent a coding for a figure as a set of codings for its line segments,
relative to a given labelling of its vertices. Each line segment would
be encoded by its vertex-labels, listed in order from endpoint to
endpoint. Thus the above example would become {ABC, BD, DEA, FE}. Other
codings for the same figure are {DB, FE, DEA, CBA} and {KXY, AX, KHA, WH}.
For our initial discussion of codings we will use this set-theoretic
notation, as its primary advantage is convenience. We soon realize though
that the usual benefits associated with set theory do not accrue: we can
use unions and intersections of codings only with extreme care, and it
is not at all clear what the complement of a coding or set of codings
should be.

We can now give a more precise definition of what we shall mean
throughout this paper by the word "coding." Let 𝒜 = {A_1, A_2, ..., A_n, ...}
be a countably infinite set: the A_i's will be vertex-labels, and we will
use A, B, C, ... as variables ranging over the set 𝒜.

Definition 1.1 A line segment ls is a finite list of elements from 𝒜
of length at least two and such that no A_i occurs more than once.¹

By way of notation we let Λ denote the set of all such line segments
and [ls] the set of all vertex-labels occurring in a line segment ls.
The length of ls is denoted |ls| and is equal to the number of elements
of [ls]. An endpoint of ls is a vertex-label which occurs first or last
in the list ls.

¹Clearly we really mean to say "a coding for a line segment ls ..."
but for convenience we will shorten this to "a line segment ls ..." when
no confusion will arise.

Definition 1.2 A coding c is a finite set of line segments such that:
first, if ls_1 ∈ c and ls_2 ∈ c are distinct then |[ls_1] ∩ [ls_2]| < 2,
and second, if ls ∈ c and A ∈ [ls] then either A is an endpoint of ls or
there is an ls' ∈ c with A ∈ [ls'] and ls' ≠ ls.
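The two conditions of Definition 1.2 can be checked mechanically. The following is a minimal sketch, assuming a coding is given as a Python collection of label sequences (for instance strings of single-letter labels); the function names `is_line_segment` and `is_coding` are illustrative and do not appear in the paper.

```python
from itertools import combinations

def is_line_segment(ls):
    """Definition 1.1: a list of at least two labels, none repeated."""
    return len(ls) >= 2 and len(set(ls)) == len(ls)

def is_coding(c):
    """Definition 1.2, for c given as a collection of label sequences."""
    segments = sorted({tuple(ls) for ls in c})
    if not all(is_line_segment(ls) for ls in segments):
        return False
    # First condition: two distinct segments share fewer than two labels.
    for ls1, ls2 in combinations(segments, 2):
        if len(set(ls1) & set(ls2)) >= 2:
            return False
    # Second condition: every interior label also lies on another segment.
    for ls in segments:
        for label in ls[1:-1]:
            if not any(label in other for other in segments if other != ls):
                return False
    return True
```

For the coding {ABC, BD, DEA, FE} discussed above, `is_coding({"ABC", "BD", "DEA", "FE"})` holds, whereas {ABC} alone fails the second condition because B is neither an endpoint nor a point of intersection.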

We let C be the set of all codings. The conditions in Definition
1.2 require that no pair of line segments can intersect more than once
and that each vertex represent an endpoint, a genuine point of
intersection, or both. To sort out the many different codings which would
correspond to the same figures, we give the following:

Definition 1.3 Let ≡ be a binary relation on C defined as follows:
if c_1 and c_2 are codings then c_1 ≡ c_2 if and only if there is a
one-to-one function f mapping ∪_{ls ∈ c_1} [ls] onto ∪_{ls ∈ c_2} [ls]
such that

1) if ls ∈ c_1 and ls = A_{n_1}A_{n_2}...A_{n_j} then either
f(A_{n_1})f(A_{n_2})...f(A_{n_j}) ∈ c_2 or
f(A_{n_j})f(A_{n_{j-1}})...f(A_{n_1}) ∈ c_2, and conversely,

2) if ls ∈ c_2 and ls = A_{n_1}A_{n_2}...A_{n_j} then either
g(A_{n_1})g(A_{n_2})...g(A_{n_j}) ∈ c_1 or
g(A_{n_j})g(A_{n_{j-1}})...g(A_{n_1}) ∈ c_1, where g = f^{-1}.

This definition says that two codings are equivalent provided
they differ only in the names of the vertex-labels present, in the
order in which the line segments are taken, and in the direction in
which each line segment is listed. Note that ∪_{ls ∈ c} [ls] is the
set of all vertex-labels appearing in the coding c.

As an example, consider the codings c_1 = {ABC, BDE, AE, AD, CE, CD}
and c_2 = {DEF, EAB, DB, BF, AF, AD}, both derived from the figure:



Figure 3 




Then ∪_{ls ∈ c_1} [ls] = [ABC] ∪ [BDE] ∪ [AE] ∪ [AD] ∪ [CE] ∪ [CD] =
{A, B, C, D, E} and similarly ∪_{ls ∈ c_2} [ls] = {A, B, D, E, F}. To see
that c_1 ≡ c_2 let f: {A, B, C, D, E} → {A, B, D, E, F} be defined by
f(A) = D, f(B) = E, f(C) = F, f(D) = A, f(E) = B. The line segment ABC
in c_1 then corresponds to the line segment f(A)f(B)f(C) = DEF in c_2.
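The equivalence test of Definition 1.3 can be sketched as a brute-force search over relabellings. This is only an illustration under stated assumptions: codings are collections of strings of single-letter labels, the helper names (`normalize`, `is_equivalence`, `find_equivalence`) are hypothetical, and trying every bijection is practical only for small figures. It is not claimed to be the "simple algorithm" the text alludes to.

```python
from itertools import permutations

def normalize(ls):
    """Treat a segment and its reversal as the same object."""
    t = tuple(ls)
    return min(t, t[::-1])

def is_equivalence(f, c1, c2):
    """Check conditions 1) and 2) of Definition 1.3 for a bijection f.

    Because f is assumed one-to-one and onto, equality of the image of
    c1 with c2 (up to reversal) gives both directions at once.
    """
    image = {normalize(tuple(f[a] for a in ls)) for ls in c1}
    return image == {normalize(ls) for ls in c2}

def find_equivalence(c1, c2):
    """Search all bijections between the label sets; return one or None."""
    v1 = sorted({a for ls in c1 for a in ls})
    v2 = sorted({a for ls in c2 for a in ls})
    if len(v1) != len(v2):
        return None
    for perm in permutations(v2):
        f = dict(zip(v1, perm))
        if is_equivalence(f, c1, c2):
            return f
    return None

c1 = {"ABC", "BDE", "AE", "AD", "CE", "CD"}
c2 = {"DEF", "EAB", "DB", "BF", "AF", "AD"}
f = {"A": "D", "B": "E", "C": "F", "D": "A", "E": "B"}
print(is_equivalence(f, c1, c2))  # True
```

The same search also confirms that {ABC, BD, DEA, FE} and {KXY, AX, KHA, WH} are equivalent, while a path {AB, BC, CD} and a star {AB, AC, AD} are not.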

It is clear from the nature of the definition that "≡" is an
equivalence relation, and hence partitions C into disjoint classes of
equivalent codings. An obvious theorem relating equivalence for codings
to figures in the plane is this:

Theorem 1.4 If c_1 is a coding for a figure F and c_1 ≡ c_2, then c_2 is
also a coding for F. Conversely, if c_1 and c_2 are codings for F then
c_1 ≡ c_2. Thus each figure determines a unique class of codings.

There are two important limitations of the codings as defined above.
First, there are a number of geometric predicates which are inexpressible,
in addition to those more or less quantitative predicates which we
specifically excluded earlier. Generally speaking, these are predicates
which are dependent upon the preservation of inside-outside relationships
within a class of equivalent codings. For example, the drawing in
Figure 2 would have the same (that is, an equivalent) coding as the
following figure, in which vertex F has been shifted to the interior
of the triangle:

Figure 4




A special case of this type of problem is convexity: we can tell from
its coding whether or not a figure is a polygon, but there is no way to
distinguish convex polygons from concave ones. Another example to show
the extent to which deformations within a class of figures can affect
the notion of interior and exterior is this:

Figure 5 

A most important geometric notion which is related to this type of
problem concerns the identification of the regions, or faces,² into
which a figure divides the plane. For example the following two figures
have the same coding:

Figure 6

Figure 7

Figure 6 has faces bounded by ABEFD and CBF whereas Figure 7 has faces
bounded by ABCD and CEF. Although we have no means of explicitly
identifying the vertices which bound the faces of a figure, it is a simple
matter to show that the number of faces will remain constant within a
class of figures having equivalent codings. To do this we use Euler's
formula³, relating the number of vertices and line segments present in
a figure to the number of faces:

V - LS + F = N

where V is the number of vertices, LS is the number of line segments
(where a line segment coded as ABCD is considered to be comprised of the
three line segments AB, BC and CD), F is the number of bounded faces, and
N is the number of connected components making up the figure. In the
previous example (Figures 6 and 7) V = 6, LS = 7 and N = 1, hence F =
1 + 7 - 6 = 2. For a given equivalence class of codings, the number of
vertices and line segments is clearly invariant, the latter being given
by Σ_{ls ∈ c} (|ls| - 1). Also, the number of connected components can be
determined from any coding by first grouping vertices into classes
according to whether they are connected to one another by a path of line
segments, and then counting these classes. Hence the number of faces F
can be determined as well, and is thus invariant.

²A "face" is usually considered to be an area of the plane which
is bounded by lines of a figure and which contains neither lines nor
vertices in its interior. Every finite figure determines a unique
infinite face, but we will follow convention and disregard it.
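The face count obtained from Euler's formula can be sketched as follows. This is a minimal illustration, assuming the coding is a collection of strings of single-letter labels and that it corresponds to a planar figure; the function name `count_faces` and the union-find helper are not part of the paper.

```python
def count_faces(c):
    """Solve V - LS + F = N for F, the number of bounded faces."""
    segments = [tuple(ls) for ls in c]
    vertices = {a for ls in segments for a in ls}
    # LS counts elementary pieces: a segment with k labels contributes k - 1.
    ls_count = sum(len(ls) - 1 for ls in segments)

    # Union-find over the labels, used to count connected components N.
    parent = {a: a for a in vertices}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for ls in segments:
        for a, b in zip(ls, ls[1:]):
            parent[find(a)] = find(b)

    n = len({find(a) for a in vertices})
    return n + ls_count - len(vertices)
```

For the coding {ABC, BD, DEA, FE} this gives V = 6, LS = 6 and N = 1, hence one bounded face, which is the triangle with vertices A, B and D.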

The second limitation of our present formulation has to do with
the connection between codings and figures in the plane. We have said
so far only that any figure determines a class of codings, but
unfortunately it does not always happen that a coding determines a
non-empty class of planar figures. In fact, the problem of specifying
constraints on our system which will guarantee a correspondence between
non-empty classes of codings and figures was studied by Rottmayer
[pp. 56-61] and turns out to be extremely difficult. We will discuss some
background material here, with the intention of indicating the scope of
the problem.

³For more material on Euler's formula, see Berge, The Theory
of Graphs, p. 207 ff.

We can begin by defining a process called segmentation, which we
will use to obtain new codings from a given one. This process will
permit us to introduce a finite number of additional vertices in a line
segment and to "break" the original line segment at each of these new
points. Thus a line segment ABCD occurring in a coding c could become
AA_1, A_1A_2, ..., A_nB, BCD under segmentation; we call the resulting
coding a segmentation of the original coding. The idea of course is to use
straight-line figures to approximate drawings containing one or more
curved lines.

The process of segmentation allows us to classify non-planar codings
into two varieties: one which we will call "geometric" and the other
"topological." By a topologically non-planar (TNP) coding we mean a
non-planar coding which has no planar segmentation. We use the adjective
"topological" because such codings will remain non-planar under the sort
of elastic deformations of the plane that are studied in algebraic
topology. It will be natural to think of such codings in terms of
vertices connected by curved lines, since we can approximate such curves
by unlimited segmentation without destroying the non-planar property.
There are basically two such figures: Kuratowski proved in 1930 that
any TNP figure must contain either one or the other as a subfigure.

The first is the "complete graph over five vertices," that is, the
figure which contains only five vertices, with lines connecting each
pair of vertices. An attempt to draw this figure will result in a figure
with a missing vertex:

Figure 8

The second consists of two sets of three vertices each, with lines
drawn from each vertex in the first set to each vertex in the second:

Figure 9

It is worth mentioning that the proofs of the non-planarity of these two
examples rely on the violation of Euler's formula.

A non-planar coding will be said to be geometrically non-planar
(GNP) if it is not TNP, that is, if some segmentation of it is planar.
The simplest example of such a coding is c = {ABC, AED, BD, CE}. It
would be "drawn" as follows:



Figure 10 




Since line segments CE and BD must intersect one another and there is
no vertex present in the coding corresponding to this intersection,
this "figure" cannot be embedded in the plane. However, by segmenting
the line segment CE for example, we get a drawing which is certainly
planar:

Figure 11

Another example of a GNP coding is given by c = {ABCD, DEF, CEG, BG, AF},
and would be drawn:



Figure 12 




Figures 11 and 12 are distinct in that neither contains the other as a
subfigure, nor is there a common GNP subfigure. Obviously, any coding
which contains either of these two examples will also be non-planar.
These two figures are alike, however, in that they both make use of the
fact that a triangle is the only necessarily convex polygon, and hence,
when a side is extended, the endpoint must lie outside the triangle.
There are other GNP figures though which do not seem to rely on this
fact, and hence it seems unlikely that a Kuratowski-type result can be
established for the GNP codings.

Although the difficulties posed by these non-planar figures are
serious ones, it will still be possible to interpret each coding as a
class of plane figures, with possibly missing vertices. None of the
simple geometric properties, like connectedness or triangularity, will
be affected by this. However, throughout the following, it must be
remembered that the correspondence between codings and figures is not
an exact one.

SECTION 2: GRAMMARS AND AUTOMATA

In this section we will develop the necessary background material
on formal languages, grammars and automata, in order to establish
notation and to state the fundamental results which we will need
later. Our notation is based on that used in Formal Languages and
Their Relation to Automata, by Hopcroft and Ullman. This will allow the
use of additional theorems appearing there without re-statement.

By an alphabet, or vocabulary, we will mean a finite set V, whose
elements can be concatenated to form strings or words. V* denotes the
set of all strings of finite length over the alphabet V. We use the
symbol ε to denote the empty string and define V+ = V* - {ε}.

Definition 2.1 A quadruple G = <V_N, V_T, P, S> is a grammar whenever

1) V_N and V_T are non-empty finite disjoint sets. We let V = V_N ∪ V_T.

2) P is a finite subset of (V+ × V+) ∪ {(S, ε)}.

3) S ∈ V_N.

It is customary to refer to V_N, V_T, P, and S as the variables
(non-terminals), terminals, production rules and start symbol,
respectively, of the grammar G. A production rule (α, β) ∈ P will be
represented by α → β. Note that we do not allow general rules of the form
α → ε.

Given a grammar G we can define a binary relation ⇒_G on V* as
follows: α ⇒_G β if and only if there exist x ∈ V* and y ∈ V* such
that α = xXy, β = xX'y, and X → X' ∈ P. Thus ⇒_G represents a one-step
derivation for the grammar G. We can extend ⇒_G in the natural way to
a new relation ⇒_G* to represent derivations of any finite number of
steps, that is, α ⇒_G* β if and only if there is an integer n > 0 and
α_1, α_2, ..., α_n ∈ V* such that α ⇒_G α_1, α_1 ⇒_G α_2, ...,
α_{n-1} ⇒_G α_n = β.

By the language generated by G, denoted L(G), we will mean the set
{α | α ∈ V_T* and S ⇒_G* α}. One stipulation is made regarding the rule
S → ε: If this rule occurs in P, then the only derivation in which it
may be used is the one-step derivation S ⇒_G* ε. We will say that two
grammars G_1 and G_2 are equivalent if and only if L(G_1) = L(G_2).

We can classify grammars according to the form of their production
rules. The most general type of grammar, having no restrictions on
the production rules, is that which is defined above, called a Type 0
grammar. If all the production rules of a grammar have the form S → ε
or α → β where |α| ≤ |β|, the grammar is said to be Type 1, or
context-sensitive. A grammar whose rules are of the form A → β where
A ∈ V_N will be called a Type 2 or context-free grammar. A Type 3 or
regular grammar is one whose rules have the form S → ε, A → v, or
A → vB, where S, A, B ∈ V_N and v ∈ V_T. It follows that any Type i
grammar (i = 1, 2, 3) is also of Type (i-1); furthermore, grammars of
different types may be equivalent. The language generated by a Type i
grammar will be called a Type i (or context-sensitive, context-free,
regular) language.
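The classification by rule form can be sketched as a small routine. This is an illustration only, under several assumptions not made in the text: every symbol is a single character, a rule is a pair of strings (left side, right side), the empty string stands for ε, and the rule S → ε is tolerated at every type. The name `grammar_type` is hypothetical.

```python
def grammar_type(rules, variables, start="S"):
    """Return the largest i in {0, 1, 2, 3} such that every rule fits Type i."""

    def start_to_empty(lhs, rhs):
        return lhs == start and rhs == ""

    def type1(lhs, rhs):
        # Context-sensitive: the right side is never shorter than the left.
        return start_to_empty(lhs, rhs) or 1 <= len(lhs) <= len(rhs)

    def type2(lhs, rhs):
        # Context-free: the left side is a single variable.
        return start_to_empty(lhs, rhs) or (lhs in variables and len(rhs) >= 1)

    def type3(lhs, rhs):
        # Regular: A -> v or A -> vB with v a terminal and B a variable.
        if start_to_empty(lhs, rhs):
            return True
        if lhs not in variables:
            return False
        if len(rhs) == 1:
            return rhs not in variables
        return len(rhs) == 2 and rhs[0] not in variables and rhs[1] in variables

    for i, fits in ((3, type3), (2, type2), (1, type1)):
        if all(fits(lhs, rhs) for lhs, rhs in rules):
            return i
    return 0
```

Under these conventions the two rules S → aSa and S → aba are classified as Type 2, while S → aS and S → b are Type 3.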

Definition 2.2 A finite automaton is a quintuple ℱ = <K, Σ, δ, q_0, F>
where

1) K and Σ are finite non-empty disjoint sets (the internal states
and input alphabet respectively)

2) δ: K × Σ → K (the transition function)

3) q_0 ∈ K (the start state)

4) F ⊆ K (the set of final states)

A finite automaton processes an input string left to right,
changing states with each new input symbol as determined by δ, until
the end of the string is reached. The string is said to be accepted by ℱ
if and only if the last state reached in the processing of the input
string belongs to F. As with grammars it is natural to extend δ to
δ*: K × Σ* → K by the inductive definition

1) if v ∈ Σ, then δ*(q,v) = δ(q,v)

2) if α ∈ Σ*, then for v ∈ Σ, δ*(q,αv) = δ(δ*(q,α),v).

Thus α is accepted by ℱ if and only if δ*(q_0,α) ∈ F.
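The extended transition function can be sketched directly from its inductive definition. In this minimal illustration the transition function is assumed to be a Python dictionary keyed by (state, symbol) pairs, and the empty string is mapped to the starting state, a convention the definition above does not cover. The example automaton, which accepts strings containing an even number of a's, is invented for the purpose.

```python
def delta_star(delta, q, string):
    """Apply delta one symbol at a time, left to right."""
    for v in string:
        q = delta[(q, v)]
    return q

def accepts(delta, q0, final, string):
    """A string is accepted when processing ends in a final state."""
    return delta_star(delta, q0, string) in final

# Hypothetical automaton over {a, b}: accept when the number of a's is even.
delta = {
    ("even", "a"): "odd",
    ("odd", "a"): "even",
    ("even", "b"): "even",
    ("odd", "b"): "odd",
}
print(accepts(delta, "even", {"even"}, "abab"))  # True
print(accepts(delta, "even", {"even"}, "ab"))    # False
```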

An important limitation of any finite automaton is its bounded
memory capacity. For example, let Σ = {a, b} and let L_1 = {a^n b a^n | n > 0}.
A finite automaton processing a string in L_1 can remember only a bounded
number of a's from the first half of the string, that number being
determined by the number of states; beyond that, the automaton must "loop"
before getting to the b, and hence it will lose count. When this
happens, the automaton cannot determine whether the number of a's after
the b is the same as the number before the b. Thus no finite automaton
can accept the language L_1.

The following is a fundamental theorem in the theory of automata. 

Theorem 2.3 A language is regular, that is, has a regular grammar, if 
and only if there is a finite automaton which accepts it. 

Context-free languages can be similarly characterised in terms
of the type of automata which accept them. A pushdown automaton is
essentially a finite automaton, with left-to-right input processing,
which has in addition an unbounded memory in the form of a pushdown
stack, operating on a first in-last out basis. Such a machine can
accept ("recognise") a language like L_1 defined above: as the machine
moves through the initial string of a's, it can store a symbol for
each occurrence; then after passing the b, it can compare each of the
following symbols with a symbol in its storage. Thus, while it cannot
"count" as such, it is capable of making comparisons. By a
non-deterministic pushdown automaton, we mean a pushdown automaton which
at some or possibly every stage in its computations has more than one
possible move it can make. The theorem for context-free languages is this.

Theorem 2.4 A language is context-free if and only if there is a non- 
deterministic pushdown automaton which accepts it. 

We note that a simple context-free grammar which generates L_1 is
given by G = <{S}, {a, b}, P, S> where P consists of the two rules
S → aba and S → aSa.
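Both views of L_1 can be sketched in a few lines. The first function imitates the pushdown strategy described above with an explicit stack, and the second carries out a derivation in the grammar G by applying S → aSa repeatedly and then S → aba. The function names `in_L1` and `derive` are illustrative, and the stack discipline is a simplification of a formal pushdown automaton rather than a full definition of one.

```python
def in_L1(string):
    """Recognise a^n b a^n (n > 0) using a stack for comparison."""
    stack = []
    i = 0
    # Store one symbol for each a before the b.
    while i < len(string) and string[i] == "a":
        stack.append("a")
        i += 1
    if not stack or i == len(string) or string[i] != "b":
        return False
    i += 1
    # Compare each a after the b with a stored symbol.
    while i < len(string) and string[i] == "a" and stack:
        stack.pop()
        i += 1
    return i == len(string) and not stack

def derive(n):
    """Apply S -> aSa (n - 1) times, then S -> aba, for n >= 1."""
    s = "S"
    for _ in range(n - 1):
        s = s.replace("S", "aSa")
    return s.replace("S", "aba")

print(derive(3))         # aaabaaa
print(in_L1(derive(3)))  # True
```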

There are similar theorems for characterizing languages of type 0 
and 1; we will use the notion of a Turing machine for that purpose. 

Definition 2.5 A Turing machine is a 6-tuple T = <K, Σ, Γ, δ, q_0, F>
where

1) K and Γ are non-empty finite disjoint sets (the states and
the tape symbols of T)

2) Σ ⊆ Γ (the input symbols)

3) q_0 ∈ K and F ⊆ K (the start state and the set of final states)

4) δ: K × Γ → K × Γ × {Left, Right} (the transition function)

In its usual interpretation, a Turing machine consists of an input tape
divided into cells, each capable of holding an element of Γ, and a tape
head which can scan the cells of the tape one at a time. The input tape
has a leftmost cell but is infinite to the right, and the input is placed
initially into the first n cells. The tape head stores the current
state symbol and after scanning the contents of a cell, it will change
the current state symbol, replace the cell contents with another symbol
from Γ, and move one cell to the left or right, all of these actions
being governed by the transition function. The machine starts in state
q_0 with the tape head positioned at the leftmost tape cell. It continues
to move through the successive configurations of states and tape symbols,
and may come to a halt in one of two ways: either by entering a final
state, or by reaching a combination of state and tape symbol for which
δ is not defined. It is also possible that the machine will never halt.
An input tape will be accepted if and only if the machine processing the
tape eventually halts in a final state.

It is convenient to provide a special symbol B for a blank cell,
with B ∈ Γ - Σ and such that the range of the transition function is
restricted to K × (Γ - {B}) × {Left, Right}; that is, blanks can only be
removed and cannot be printed. The input string, which is finite in
length, would then be followed by an infinite number of blank cells;
this eliminates the problem of having to add new cells to the right end
of the tape when more cells are needed.
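The interpretation just described can be sketched as a small simulator. Everything specific here is an assumption made for illustration: the partial transition function is a dictionary, a step limit stands in for the possibility that the machine never halts, moving left from the leftmost cell is treated as a rejecting halt, and the example machine for a^n b a^n (with an extra tape symbol X used as a marker) is invented rather than taken from the paper. The example never prints a blank, in keeping with the restriction above.

```python
def run_turing_machine(delta, q0, final, tape_input, blank="B", max_steps=10_000):
    """Return True if accepted, False if halted otherwise, None if undecided."""
    tape = list(tape_input) or [blank]
    state, head = q0, 0
    for _ in range(max_steps):
        if state in final:
            return True
        key = (state, tape[head])
        if key not in delta:          # delta undefined: halt without accepting
            return False
        state, symbol, move = delta[key]
        tape[head] = symbol
        head += 1 if move == "Right" else -1
        if head < 0:                  # assumption: falling off the left end rejects
            return False
        if head == len(tape):         # blanks extend the tape to the right
            tape.append(blank)
    return True if state in final else None

# Hypothetical machine: repeatedly mark the leftmost and rightmost unmarked a.
delta = {
    ("q0", "a"): ("q1", "X", "Right"),
    ("q0", "b"): ("q4", "b", "Right"),
    ("q1", "a"): ("q1", "a", "Right"),
    ("q1", "b"): ("q1", "b", "Right"),
    ("q1", "X"): ("q2", "X", "Left"),
    ("q1", "B"): ("q2", "X", "Left"),
    ("q2", "a"): ("q3", "X", "Left"),
    ("q3", "a"): ("q3", "a", "Left"),
    ("q3", "b"): ("q3", "b", "Left"),
    ("q3", "X"): ("q0", "X", "Right"),
    ("q4", "X"): ("qf", "X", "Right"),
}
print(run_turing_machine(delta, "q0", {"qf"}, "aabaa"))  # True
```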

There are many different formulations of Turing machines:
non-deterministic ones, ones having two-way infinite tapes or even
two-dimensional tapes, others with two or more distinct tapes and tape
heads, each under independent control, and so on. Each of these
formulations can be shown to be equivalent to the original description.
However, one formulation which is not equivalent is known as a
linear-bounded automaton (LBA). This is a non-deterministic Turing machine
whose tape length is fixed by the length of the input; in other words,
the tape head cannot move beyond those cells which contain the input. We
can partially overcome this restriction by changing the form of the input
by "padding" the real input with a number of blank cells; this limits us
to a length of cells which is a linear function of the real input. It is
unknown whether deterministic LBA's accept a smaller class of languages
than non-deterministic LBA's.

Theorem 2.6 A language is Type 0 (Type 1) if and only if it is accepted
by some Turing machine (LBA).

The results mentioned here establish a hierarchy of automata in
parallel with the previous ranking of formal grammars. To emphasize
this point, we could have defined finite and pushdown automata as
special cases of Turing machines. The former is a Turing machine which
moves right only and cannot change the symbols it sees. The latter is
similar, but with a memory tape which is accessible only from one end
and whose length is bounded by the length of the input. The four central
types of automata can thus be seen as progressively weaker versions
of a Turing machine.

SECTION 3: GEOMETRIC PREDICATES

We can now translate our discussion of codings from the informal
set-theoretical notation used earlier to that of formal languages. The
natural way to do this is to interpret a set c = {ABC, BDE, DFG, FA},
for example, as a string of symbols c' = ABC#BDE#DFG#FA, where #
serves as a line-delimitation symbol⁴. Since the order of elements in
a set is irrelevant, we could obtain 23 other strings from the set c
just by thinking of c's elements in a different order: these other
strings will be equivalent to the first under an appropriate definition.

Before we can apply the techniques and theorems of formal languages,
however, it will be necessary to represent codings as strings of symbols
over a finite alphabet. We will want to discuss figures having an
unbounded number of vertices, so we cannot provide distinct alphabet
symbols for each of the vertices which we might encounter in coding an
arbitrary figure. We shall get around this by denoting the nth vertex
as "A" followed by n primes, which we will indicate in shorthand as A(n),
or A''...' (n times), or occasionally, A(')^n. Thus we can use an alphabet
having only three symbols, A, ', #. Using this system, a triangle could
be coded

(*) A(1)A(2)#A(2)A(3)#A(3)A(1), that is,

(**) A'A''#A''A'''#A'''A'.

In general, a triangle would have the coding

(***) A(i)A(j)#A(j)A(k)#A(k)A(i),

where i, j, and k are distinct positive integers, or some permutation of
the line segments and vertices within this string (eight altogether).

⁴We use "#" instead of "," for line-delimitation in order to avoid
confusing symbols which are used both formally and informally.

An alternative alphabet with only two symbols, 0 and 1, could
equally well have been used, by representing A(n) by 01^n and # by 0.
The above codings would become

(****) 01011001101110011101, and

01^i 01^j 0 01^j 01^k 0 01^k 01^i.

A quick check will indicate that neither of these formulations is
ambiguous, as it might seem at first. Another method for coding would
be to use directly the symbols occurring in (*), namely the alphabet
{A, #, (, ), 0, 1, ..., 9}, where the vertices would now be distinguished
by their indices in base ten, rather than by the number of primes.
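The two alphabets can be compared with a short sketch. Here a coding is assumed to be a list of line segments, each a list of positive vertex indices; the function names are illustrative. The decoder for the two-symbol alphabet shows why that encoding is unambiguous: since every vertex carries at least one 1, a 0 that is not followed by a 1 can only be a delimiter.

```python
import re

def encode_primes(c):
    """Encode over {A, ', #}: vertex n becomes A followed by n primes."""
    return "#".join("".join("A" + "'" * n for n in ls) for ls in c)

def encode_binary(c):
    """Encode over {0, 1}: vertex n becomes 0 followed by n ones, # becomes 0."""
    return "0".join("".join("0" + "1" * n for n in ls) for ls in c)

def decode_binary(s):
    """Recover the list of segments from the two-symbol encoding."""
    coding, ls = [], []
    for token in re.findall(r"01+|0", s):
        if token == "0":              # a lone 0 is the line delimiter
            coding.append(ls)
            ls = []
        else:                         # 0 followed by n ones is vertex n
            ls.append(len(token) - 1)
    coding.append(ls)
    return coding

triangle = [[1, 2], [2, 3], [3, 1]]
print(encode_primes(triangle))   # A'A''#A''A'''#A'''A'
print(encode_binary(triangle))   # 01011001101110011101
```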

In the following presentation we shall use the alphabet V = {A, ', #}
for coding as in (**) above, because it provides a graphic representation
which is easier to read than (****) and easier for a machine to scan than
the alphabet using base ten indices, owing to the smaller number of
symbols. Some results about different coding procedures will be given
later.

Definition 3.1 ls ∈ V*, where V = {A, ', #}, is a line segment⁵ if and
only if ls = A(n_1)A(n_2)...A(n_j) where j ≥ 2, and each n_i is a positive
integer with n_i ≠ n_i' whenever 1 ≤ i < i' ≤ j. We will denote by Λ the
set of all line segments.

This definition is clearly a translation of our previous notion of
a line segment. Notice that the symbol "A" must be followed by one
or more primes to be a vertex-label: this conforms to our earlier
convention of using A(n) for the nth vertex. For terminology, we will
refer to a vertex A(n) in a line segment ls ∈ Λ as an endpoint of ls
if it happens that A(n) occurs as either the first or the last vertex
in ls; otherwise A(n) is an interior vertex of ls.

⁵Again we use the term "line segment" rather than the more accurate
term "coding for a line segment".

Definition 3.2 c ∈ V* is a coding if and only if

1) c = ls_1#ls_2#...#ls_n where each ls_i ∈ Λ and n > 0,

2) no more than one vertex-label A(m) occurs simultaneously in
ls_i and ls_j when i ≠ j, and

3) if A(m) is an interior vertex-label in some line segment ls_i
then A(m) also occurs as a vertex-label in some line segment
ls_j where i ≠ j.

We will denote the set of codings by C.
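Definitions 3.1 and 3.2 can be checked on strings over {A, ', #} with a short sketch. The function names `parse_segment` and `is_string_coding` are illustrative, and the apostrophe character stands for the prime; this is one possible reading of the definitions, not a procedure given in the paper.

```python
import re
from itertools import combinations

def parse_segment(ls):
    """Return the indices [n_1, ..., n_j] if ls is a line segment, else None."""
    if not re.fullmatch(r"(A'+){2,}", ls):
        return None
    indices = [len(primes) for primes in re.findall(r"A('+)", ls)]
    return indices if len(set(indices)) == len(indices) else None

def is_string_coding(c):
    """Check conditions 1) to 3) of Definition 3.2 on a string c."""
    segments = [parse_segment(ls) for ls in c.split("#")]
    if any(s is None for s in segments):          # condition 1
        return False
    for s1, s2 in combinations(segments, 2):      # condition 2
        if len(set(s1) & set(s2)) > 1:
            return False
    for i, s in enumerate(segments):              # condition 3
        for m in s[1:-1]:
            if not any(m in t for k, t in enumerate(segments) if k != i):
                return False
    return True
```

The triangle string (**) passes this check, while a string such as A'A''A''' fails condition 3, since its interior vertex lies on no other line segment.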

As before we let [Is] be the set of all vertex-labels which occur 
in the line segment Is; hence rA'*A'A""T « (A' , A' \ A" " } . We will 
denote by fc]^ and \c]^ the set of all vertex-labels and the set of all 
line segments respectively in a coding c^. 

Definition 3.3 Let a binary relation ≡ be defined on C by c_1 ≡ c_2
if and only if there is a one-to-one function f mapping [c_1]_V onto
[c_2]_V such that

1) if ls is a line segment in c_1 and ls = A(n_1)A(n_2) ... A(n_j),
then either f(A(n_1))f(A(n_2)) ... f(A(n_j)) ∈ [c_2]_L or
f(A(n_j))f(A(n_{j-1})) ... f(A(n_1)) ∈ [c_2]_L, and conversely,

2) if ls is a line segment in c_2 and ls = A(n_1)A(n_2) ... A(n_j),
then either g(A(n_1))g(A(n_2)) ... g(A(n_j)) ∈ [c_1]_L or
g(A(n_j))g(A(n_{j-1})) ... g(A(n_1)) ∈ [c_1]_L, where g = f^{-1}.

If c_1 ≡ c_2 we call c_1 and c_2 equivalent codings.

Footnote: Put into the terminology of formal languages, we can describe
[c]_L as the set of all substrings of the coding c which are maximal with
respect to the property of not containing the symbol #, and a similar
characterization applies to [c]_V.
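
The relation of Definition 3.3 can be tested by brute force. The sketch below is only illustrative: it assumes both inputs are already valid codings, represents a coding as a list of segments with each vertex A(n) stored as the integer n, and simply tries every one-to-one map f between the two vertex sets. The names normalize and equivalent are hypothetical.

```python
from itertools import permutations

def normalize(segments):
    """Forget the order of the segments and the direction of each one."""
    return sorted(min(tuple(s), tuple(reversed(s))) for s in segments)

def equivalent(c1, c2):
    v1 = sorted({v for s in c1 for v in s})
    v2 = sorted({v for s in c2 for v in s})
    if len(v1) != len(v2) or len(c1) != len(c2):
        return False
    target = normalize(c2)
    # Try every bijection f from the vertices of c1 onto those of c2.
    for image in permutations(v2):
        f = dict(zip(v1, image))
        if normalize([[f[v] for v in s] for s in c1]) == target:
            return True
    return False
```

Because distinct segments of a coding never share two vertices, matching the normalized segment lists is the same as requiring conditions 1) and 2) together. The search is factorial in the number of vertices, so it is suitable only for small figures.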

Definition 3.4 A predicate P is a subset of V*, where V is defined
as above.

In the usual set-theoretic sense, predicates are identified as
subsets of a specified set, but in the present formulation, it is
possible to think of predicates as languages as well, in particular, as
languages over the alphabet V. Hence certain predicates can be said to
be regular, context-free, and so on. The predicates we have in mind will
be geometric in nature: for example, the predicate K of connectedness
would be defined K = {c ∈ V* | c is a coding of a connected figure in
the plane}. Thus K is a sublanguage of C. That K is well-defined is a
consequence of the fact that C is well-defined and that "connectedness"
is a generally recognised property applying to geometric figures.

An example of the type of results we wish to discuss is the
following:

Theorem 3.5 The predicate P(x) given by "x is the coding for a line
segment" is context-free and not regular.

Proof: The predicate P is given by {A(n)A(m) | n ≠ m}. The
following context-free grammar will generate these strings and no others:

G = <V_N, V_T, P, S>, where V_N = {S, S_1, S_2, S_3}, V_T = {A, '},
and where P is given by eight rules, numbered 1) through 8).

[The eight production rules are not legible in the source; the only
surviving fragments are two right-hand sides beginning "AS".]

Rules 1-5 will generate strings of the form A(n)A(m) with n > m ≥ 1
and rules 4-8 will generate similar strings with m > n ≥ 1. The form
of the rules shows that G is context-free.

This predicate cannot be regular because the memory limitations of
a finite automaton fail to guarantee that n and m will be distinct
integers for sufficiently large values of n and m.
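
To make the context-free half of the argument concrete, here is a hypothetical grammar for {A(n)A(m) | n ≠ m, n, m ≥ 1}, together with a small Python check. It is a sketch and is not claimed to be the grammar G of the proof: the variable names U, W, E and the rules themselves are our own assumptions. U produces the part after the first A when n > m, W does the same when m > n, and E produces one or more primes. The check enumerates every terminal string of bounded length and compares the result with the intended set.

```python
from collections import deque

# Hypothetical grammar; upper-case S, U, W, E are variables, A and ' are terminals.
RULES = {
    "S": ["AU", "AW"],
    "U": ["'U'", "'EA'"],   # '^n A '^m with n > m >= 1
    "W": ["'W'", "'A'E"],   # '^n A '^m with m > n >= 1
    "E": ["'E", "'"],       # one or more primes
}

def generate(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    seen, out = {"S"}, set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if idx is None:
            out.add(form)          # no variables left: a terminal string
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # No rule shortens a string, so longer forms can be discarded.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

def expected(max_len):
    return {"A" + "'" * n + "A" + "'" * m
            for n in range(1, max_len) for m in range(1, max_len)
            if n != m and n + m + 2 <= max_len}
```

Up to the chosen length bound, generate and expected agree; this is evidence for the sketch, not a proof that the two sets coincide at every length.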

This theorem illustrates a problem which we will deal with at some
length, a problem which causes even the simplest imaginable predicate,
short of the "empty" predicate, to fail to be regular. The difficulty
lies in counting and comparing the number of primes occurring after an
"A" in a coding. Simply put, a finite automaton cannot tell one vertex
from another, unless a bound is placed on the vertex superscripts. This
limitation is combinatorial rather than geometric, and we suggest one
way of overcoming it:

Definition 3.6 An initial coding c is an element of C in which the
only vertices appearing are A(1), A(2), ..., A(n), where n is the number
of distinct vertices present in the coded figure. If P is a predicate
of codings, then P_I will be the predicate consisting of all codings in
P which are initial. In particular, C_I is the set of all initial
codings.



This definition reflects the natural way a person would assign
vertex-labels to a figure, starting with A(1) rather than with A(1703).
Using initial codings, a line segment would have only two distinct
representations: A'A'' and A''A'. Notice that for any coding c there
is an initial coding which is equivalent to it, and furthermore, that
we can speak of initial codings, but not of initial strings in general.
We will discuss initial and non-initial predicates in parallel until
the two notions can be shown to converge.

We now prove some results about the predicates which are
recognisable by finite automata.

Theorem 3.7 Let c be any coding and let P_I(x) be the predicate given
by "x is an initial coding which is equivalent to c". Then P_I(x)
is regular.

Proof: Immediate since there are only a finite number of initial
codings which are equivalent to a given coding, and any finite language
is regular.

As an example of such a predicate, consider the set of all initial
codings for triangles. One such coding is A'A''#A''A'''#A'''A'. There
are seven other ways to order the vertices within each line segment and
six ways of ordering the lines, giving 48 distinct codings, each of
which is initial and equivalent to the first.
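
The count of 48 can be confirmed by enumeration. The following Python sketch (the names label and triangle_codings are hypothetical) builds every ordering of the three sides and every direction of each side, and collects the resulting strings in a set.

```python
from itertools import permutations, product

def label(n):
    """The vertex-label A(n): an A followed by n primes."""
    return "A" + "'" * n

def triangle_codings():
    sides = [(1, 2), (2, 3), (3, 1)]
    codings = set()
    for order in permutations(sides):                  # 6 orderings of the lines
        for flips in product([False, True], repeat=3): # 8 choices of direction
            segs = [(b, a) if flip else (a, b)
                    for (a, b), flip in zip(order, flips)]
            codings.add("#".join(label(a) + label(b) for a, b in segs))
    return codings
```

The set has 6 x 8 = 48 elements, one of which is the coding A'A''#A''A'''#A'''A' quoted above. Since the language is finite, a finite automaton could in principle simply list these strings.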

It is easily proved that regular languages are closed under the
set operations of union, intersection and complementation. As a result,
initial predicates like the following are also regular:

1) x is not a triangle,

2) x is either a line or a quadrilateral,

3) x is neither a triangle nor a quadrilateral.

Note that the complementary predicate given by "x is not an initial
coding equivalent to c" is satisfied not only by codings which are not
initial and by codings which are initial but not equivalent to c, but
also by strings in V* which are not codings. Thus the complement of a
language must be taken with respect to V* and not C or C_I. In fact, it
will be shown later (Theorem 3.15) that if the complement is instead
taken with respect to C_I, so that the predicate becomes "x is an initial
coding which is not equivalent to c", then it will not be regular in any
case, for any coding c. This illustrates the result of taking
complements with respect to C_I, instead of V*.

We now examine the (non-initial) predicate T(x) given by "x is a
triangle". In view of our previous results, we might expect T to be
context-free; surprisingly, this is not the case.

Theorem 3.8 The predicate T(x) given by "x is a triangle" is
context-sensitive and not context-free.

Proof: We will not give a proof of the second half of the theorem;
similar languages are proved to be non-context-free in Ginsburg
[p. 88 ff.]. However, the techniques required for this proof will be
used later in this section. To show that T is context-sensitive, we
give a complete grammar which will generate all and only codings for
triangles. It should demonstrate the complexities involved in handling
a simple, non-initial predicate. We use the notation:

    X → A/B            for X → A and X → B,

    X(A,B) → Y         for XA → Y and XB → Y, and

    X(A,B) → (A,B)X    for XA → AX and XB → BX.

The production rules are as follows:

        S → Q_1 C_1 B_1 T
        T → L_1#L_2#L_3 / L_1#L_3#L_2 / L_2#L_1#L_3 /
            L_2#L_3#L_1 / L_3#L_1#L_2 / L_3#L_2#L_1
  I     L_1 → XY/YX
        L_2 → XZ/ZX
        L_3 → YZ/ZY

        B_1 → B_1 B_1
  II    C_1 → C_1 C_1
        Q_1 → Q_1 D_1

        B_i(X,Y,',#) → (X,Y,',#)B_i     for i = 1, 2
  III   B_1 Z → Z'B_2
        B_2 Z → Z'

        C_i(X,',#) → (X,',#)C_i         for i = 1, 2, 3, 4
  IV    C_i(Y,Z) → (Y',Z')C_{i+1}       for i = 1, 2, 3
        C_4(Y,Z) → (Y',Z')

        D_i(',#) → (',#)D_i             for i = 1, 2, 3, 4, 5, 6
  V     D_i(X,Y,Z) → (X',Y',Z')D_{i+1}  for i = 1, 2, 3, 4, 5
        D_6(X,Y,Z) → (X',Y',Z')

        Q_i(X,Y,Z) → A'Q_{i+1}          for i = 1, 2, 3, 4, 5
  VI    Q_i(',#) → (',#)Q_i             for i = 1, 2, 3, 4, 5
        Q_6(X,Y,Z) → A'

(Total: 104 rules)
It can be seen from the rules that V_N = {S, T, L_1, L_2, L_3, B_1, B_2,
C_1, C_2, C_3, C_4, D_1, D_2, D_3, D_4, D_5, D_6, Q_1, Q_2, Q_3, Q_4, Q_5,
Q_6, X, Y, Z}.
The rules in groups I and II are used to select one of the 48 possible
orderings of line segments and vertices in the coding of a triangle
and then to generate the proper number of primes for each vertex. The
rules in groups III, IV and V shift these primes to the immediate
right of the appropriate vertex symbols. The last group of rules forces
the complete application of all previous rules before generating the
final terminal string.

Note that the preceding grammar can be generalized to produce all
codings of n-gons for a fixed value of n, and hence this predicate is
also context-sensitive.

Theorem 3.9 C and C_I are context-sensitive languages.

Proof: The suggestion to use LBA's in this proof was originally
due to R. Roskies. The plan is to enter a string of symbols from the
alphabet V = {A, ', #} as input to a suitably defined LBA. To ensure
that the tape head does not leave that portion of the tape which
contains the input string, it is customary to surround the input with
end-markers, ¢ on the left and $ on the right. The LBA will proceed
through a series of tests on the input, each of which will leave the
input string unchanged upon completion. The tests we have in mind
are these:

1) check that each maximal substring of the input containing
no #'s satisfies the definition of a line segment,

2) check that no pair of such line segments has more than one
vertex in common,

3) check that each vertex occurs either as an endpoint of a
line segment or at least twice in the coding.

In the case of C_I we make an additional test:

4) check that the only vertices which occur are A(1), A(2), ...
A(n) for some n.

We now give an explicit program which performs the first task. This
will illustrate the technique involved and indicate that an LBA is
adequate to perform such tasks.
Routine R1

[The transition table for R1 is not legible in the source. Of its
description only the fragments "... of the input." and "... symbol
except ¢." survive, together with a statement naming the start state
and the final state, whose subscripts are illegible.]

Routine R2

[The transition table for R2 is not legible in the source; only
fragments of three entries survive.]

R2 starts at the leftmost A in the first line segment and checks that
no other vertex in the same line segment has the same number of '
symbols following it. If such is the case, the second vertex is similarly
tested, and so on. When every vertex in the first line segment has been
tested, the LBA proceeds to the second and succeeding line segments.
The start state and the final state are q_0 and q_21, respectively.

The second test uses the techniques employed in R2 above: the
automaton proceeds through each pair of line segments comparing all
vertices, one from each line segment, and halting in a non-final state
if a pair of line segments have more than one vertex in common.

The third test is almost identical with the second, except that
it is necessary to check for duplication only those vertices which do
not occur as endpoints, and they must be checked against all vertices
which occur.

The fourth test breaks into two parts: first, the longest string
of primes is located and replaced with X's. Then the input is checked
for a string of primes which is exactly one symbol shorter than the
string of X's. If one is found, we replace one X with a ' and repeat
the test until only one X remains. The LBA then replaces this X with
a ' and halts in a final state.

The following theorem (3.11) will be of fundamental importance
when discussing the limitations of formal languages in geometric
applications. However, we first state a result which provides a
partial characterization of context-free languages, due originally to
Bar-Hillel, Perles and Shamir. It is referred to variously as the
"uvwxy" theorem in Hopcroft and Ullman [p. 57] and the "xuwvy" lemma
in Ginsburg [p. 84].

Lemma 3.10 Let ℒ be a context-free language. Then there exist
integers p and q with the property that any z ∈ ℒ with |z| > p can
be written as z = uvwxy where |vwx| ≤ q, either v ≠ ε or x ≠ ε, and
where z_i = u v^i w x^i y ∈ ℒ for each integer i ≥ 0.

Theorem 3.11 ("Star" Theorem) For n > 1, let ST(n) be the coding for
an (n-1)-pointed "star" figure, A(1)A(2)#A(1)A(3)# ... #A(1)A(n). Let
P be any predicate of codings which contains ST(n) for infinitely many
n. Then P cannot be context-free.

Proof: Let P be any such predicate of codings, that is, P ⊆ C
and ST(n) ∈ P for infinitely many n, and suppose that P is context-free.
We can clearly select an n > 0 and a z = ST(n) ∈ P such that |z| > p,
where p is determined by P as in the lemma. Hence z = uvwxy. We shall
examine v (or x if v = ε).

Footnote: The "uvwxy" theorem provides a necessary condition for
context-free languages, but it is not known whether the condition is
also sufficient.

First, since v can be assumed to be non-empty, suppose v contains
an occurrence of the symbol #, that is, v = α#β, where α and β ∈ V* (and
possibly one or both is empty). Then v^i = α#βα#β ... α#β = α(#βα)^{i-1}#β.
If α and β are both empty, then z_i = u v^i w x^i y will contain a
repetition of the # symbol i times, and if either α or β is non-empty,
z_i will contain i-1 repetitions of #βα. Either way, even if βα should
be a well-formed line segment, z_i will not be a well-formed coding,
regardless of what wxy looks like. Thus v cannot contain #.

Now suppose that v contains the symbol A. If so, then v is A'A(')^j,
'A(')^j, or A(')^j for some integer j ≥ 0. Hence v^i will be
A'A(j)A'A(j) ... A'A(j), 'A(j+1)^{i-1}A(j), or A(j)^i. None of these
substrings can occur in a coding, since each represents a repetition of
vertices within a line segment. Thus v cannot contain the symbol A.

Hence v must be a string of ' symbols. Suppose v consists of all
the ' symbols following some occurrence of an A in z. Unless wxy = ε, a
possibility we can avoid by taking n > q+1, v is bounded on the left
by an A and on the right by either an A or a #. By Lemma 3.10, z_0 = uwy
∈ P. But uwy will have an occurrence of either an AA or an A#, neither
of which can occur in a proper coding.

Therefore, v is a string of k ≥ 1 ' symbols, which occurs in a line
segment A'A(n) in z, where n > k. In z_0 = uwy, this line segment becomes
A'A(n-k) if n-k > 1, or it becomes A'A': in the first case, it duplicates
a line segment occurring to the left of A'A(n), in the second it becomes
ill-formed. The occurrence of x, if x ≠ ε, will not affect the argument,
since it is to the right of v. Thus we see that z cannot be represented
in such a way that z_i ∈ P for each i ≥ 0; hence P is not context-free.

Corollary 3.12 Neither C nor C_I is context-free.

Proof: Clearly ST(n) ∈ C_I ⊆ C for all integers n > 1.
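
The last case in the proof of the "Star" Theorem, deleting primes from one point of the star, can be illustrated with a short Python sketch. The names st, well_formed and pump_down are hypothetical, and well_formed checks only the conditions relevant to stars (segments of at least two distinct vertices, and no two segments sharing more than one vertex), since a star has no interior vertices.

```python
import re

def st(n):
    """ST(n) = A(1)A(2)#A(1)A(3)# ... #A(1)A(n)."""
    return "#".join("A'" + "A" + "'" * m for m in range(2, n + 1))

def segments(s):
    # Each vertex is recorded by its number of primes; a bare A gives 0.
    return [tuple(len(m) - 1 for m in re.findall(r"A'*", piece))
            for piece in s.split("#")]

def well_formed(s):
    segs = segments(s)
    ok_segments = all(len(g) >= 2 and 0 not in g and len(set(g)) == len(g)
                      for g in segs)
    ok_pairs = all(len(set(a) & set(b)) <= 1
                   for i, a in enumerate(segs) for b in segs[i + 1:])
    return ok_segments and ok_pairs

def pump_down(n, m, k):
    """ST(n) with k primes removed from the vertex A(m) of segment A(1)A(m)."""
    parts = st(n).split("#")
    parts[m - 2] = "A'" + "A" + "'" * (m - k)
    return "#".join(parts)
```

For n = 6, every choice of a point A(m) and of k between 1 and m gives a string that is not well formed: the shortened segment either repeats an earlier segment, becomes A'A', or loses all its primes. Adding a prime to the last vertex, by contrast, still gives a well-formed string, which is consistent with the proof working with z_0.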

Theorem 3.9 and Corollary 3.12 characterize the two methods of
coding geometric figures. Using these results we can now demonstrate
the convergence of these two methods at the context-sensitive level.

Corollary 3.13 Part I Let P be any predicate of codings, that is,
P ⊆ C, which is context-sensitive. Then the corresponding initial
predicate P_I is also context-sensitive.

Part II Conversely, let P_I be any predicate of initial codings,
that is, P_I ⊆ C_I, which is context-sensitive and which is invariant
under coding equivalence: if x ∈ C_I and y ∈ C_I and x ≡ y, then
P_I(x) if and only if P_I(y). Then the extension of P_I to non-initial
codings, defined by P(x) if and only if for some y, x ≡ y and P_I(y),
will also be context-sensitive.

Proof, Part I: To convert from P to P_I we need only add an
additional test to the routines used in the testing of P, namely a
verification that the input is initial, which in Theorem 3.9 was seen
to be context-sensitive.

Part II: Let P and P_I satisfy the hypotheses of Part II of the
corollary. To determine if P(x) holds, we first check that x is a coding,
then we transform x into an equivalent initial coding y and check
whether P_I(y) holds. The first and last of these tests are known already
to be within the scope of some LBA, so we need only show that an LBA can
transform a given coding into an equivalent initial one. Note that we
can always find an equivalent initial coding whose length is not greater
than that of the original coding; it is not true that an initial
coding is always no longer than a non-initial equivalent coding.

We can define the LBA which converts a coding to an initial one
as follows: we first locate the shortest string of ' symbols in the
input and replace it with a single temporary prime-marker M followed
by enough N symbols to fill out the string. Then the rest of the
input is scanned for strings of ' symbols having the same length,
which are treated similarly. Then the next longest strings of ' symbols
are located and replaced by two M's followed by N's, and so on. When the
process is complete, the N's will be eliminated and the M's will become
' symbols again. The effect is to convert the vertex of lowest index
into an A(1), the vertex of next lowest index into an A(2), and so on.
It is clear that such a process can be carried out within the given
amount of tape.
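
The effect of this conversion, though not the tape-level marking with M's and N's, can be sketched in a few lines of Python. The names to_initial and is_initial are hypothetical. Each run of primes is replaced by a run whose length is the rank of the original length among all the run lengths that occur.

```python
import re

def to_initial(coding):
    """Relabel vertices so that the indices used are exactly 1, 2, ..., n."""
    counts = sorted({len(m) for m in re.findall(r"'+", coding)})
    rank = {n: i + 1 for i, n in enumerate(counts)}
    return re.sub(r"'+", lambda m: "'" * rank[len(m.group())], coding)

def is_initial(coding):
    # A coding is initial exactly when the relabelling leaves it unchanged.
    return to_initial(coding) == coding
```

For example, to_initial maps A''A'''''#A'''''A''' to A'A'''#A'''A''. Since a rank never exceeds the length it replaces, the output is never longer than the input, which is the property the LBA relies on.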

An interesting question is whether the preceding corollary holds
when "context-sensitive" is replaced throughout by "context-free" or
"regular". Obviously Part II will fail under either replacement. The
predicate T_I(x) given by "x is an initial coding for a triangle" is
regular (Theorem 3.7) and hence also context-free, but T(x), the
extension of T_I(x) to non-initial codings, is neither regular nor
context-free (Theorem 3.8).

Part I of the Corollary is problematic in that we have virtually
no examples of non-initial predicates of codings which are regular and
only one which is context-free (Theorem 3.5). For this one example,
Part I certainly holds, but for now we will leave the question open, for
the reason that an answer to it will not be likely to have any application
to the kind of predicates we are discussing here.

We conclude this section with some results about previously-
mentioned predicates.

Theorem 3.14 The predicate K(x) given by "x is the coding of a connected
figure" is context-sensitive and not context-free.

Proof: The simple and obvious technique for handling connectedness
with an LBA is this: we first check that the input is a proper coding
and then mark with a new symbol X each of the vertices in the first line
segment. Then we iterate two routines until no change results: the
first is to mark with an X each additional occurrence of vertices already
so marked, and the second is to mark with X's each vertex occurring in
a line segment which contains an X-marked vertex. This will identify
those vertices which are connected to the first vertex. If all vertices
are marked at the end of the process, then the figure is connected.

That K is not context-free is a consequence of the "Star" Theorem.
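
The marking procedure in this proof translates directly into a fixed-point loop. The following Python sketch (the name is_connected is hypothetical) assumes the input is already a valid coding, given as a list of segments with each vertex A(n) stored as the integer n; a set of marked vertices plays the role of the X marks.

```python
def is_connected(segments):
    # Start by marking every vertex of the first line segment.
    marked = set(segments[0])
    changed = True
    while changed:                      # iterate until no change results
        changed = False
        for seg in segments:
            # A segment holding a marked vertex has all its vertices marked.
            if marked & set(seg) and not set(seg) <= marked:
                marked |= set(seg)
                changed = True
    return marked == {v for seg in segments for v in seg}
```

Marking a vertex in the set marks all of its occurrences at once, so the two routines of the proof collapse into the single update inside the loop.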

Theorem 3.15 Let c ∈ C and let P(x) be the predicate "x ∈ C and
x ≢ c". Then both P and P_I are context-sensitive and not context-free.

Proof: Neither P nor P_I is context-free, for the "Star" Theorem
applies to each, even if c = ST(n) for some n > 1.

An LBA to recognise elements of P or P_I would first determine
whether or not the input is a coding and, in the case of P_I, whether or
not it is initial. Since there are a finite number of initial codings
equivalent to c, the LBA would then transform its input to an initial
coding, if it is not one already, and compare the result with each of the
possible initial codings.

In view of Theorem 3.15, we might ask whether the complement of any
context-sensitive predicate remains context-sensitive when taken with
respect to C. The answer is that we simply do not know at this point,
for this question seems to be a special case of the larger problem of
whether the class of context-sensitive languages is closed under
complementation. An affirmative answer would follow if the
class of languages accepted by deterministic LBA's were the same as that
accepted by nondeterministic LBA's.

One last predicate which was mentioned earlier is the "figure in
context" predicate, an example being "x contains a triangle". We can
tell if a given figure contains another as a subfigure by a correct
succession of vertex deletions, removing the lines connecting each
vertex at the same time. On the level of codings, the process would
work like this: if c is a coding and A(n) is a vertex in c, we can
obtain a new coding c' from c by first deleting every occurrence of
A(n) in c, then deleting any vertex A(m) which had occurred as the only
other vertex on a line segment A(n)A(m) or A(m)A(n) in the original
coding c, but deleting only those occurrences of A(m) which are as just
described. Last, each time we delete a vertex A(m) we must also remove
one of the # symbols which had been adjacent to it. For example,
suppose we wish to remove vertex A(3) in the following figure:

[Figure 1: the example figure, with vertices A(1) through A(5)]

A coding for this figure would be:

A(1)A(5)A(2)#A(5)A(3)A(4)#A(1)A(3)#A(2)A(3)#A(1)A(4)#A(2)A(4).

To delete vertex A(3) we first remove each occurrence of A(3):

A(1)A(5)A(2)#A(5)A(4)#A(1)#A(2)#A(1)A(4)#A(2)A(4),

then we delete the occurrences of A(1) and A(2) in the third and fourth
line segments respectively, together with one of the # symbols for each:

A(1)A(5)A(2)#A(5)A(4)#A(1)A(4)#A(2)A(4).

The resulting coding represents the following figure:

[Figure 2: the figure remaining after vertex A(3) is deleted]

Thus if F is a figure, we can recognise F within the context of
another figure by some succession of such deletions.
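
The deletion step can be sketched in Python on a list-of-segments representation, in which the # symbols are implicit. The names delete_vertex and show are hypothetical. Removing A(n) from every segment and then discarding any segment left with a single vertex has the same effect as the three steps described above.

```python
def delete_vertex(segments, n):
    result = []
    for seg in segments:
        kept = [v for v in seg if v != n]   # drop every occurrence of A(n)
        if len(kept) >= 2:                  # a lone leftover vertex is dropped
            result.append(kept)             # along with its segment
    return result

def show(segments):
    """Write a list of segments in the A(n) notation used in the text."""
    return "#".join("".join(f"A({v})" for v in seg) for seg in segments)

figure = [[1, 5, 2], [5, 3, 4], [1, 3], [2, 3], [1, 4], [2, 4]]
```

Applied to the example, show(delete_vertex(figure, 3)) gives A(1)A(5)A(2)#A(5)A(4)#A(1)A(4)#A(2)A(4), the coding obtained in the text.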

Theorem 3.16 The predicate P_F(x) given by "x contains F as a subfigure"
is context-sensitive.

Proof: We can design an LBA which takes an input string, checks
to determine if it represents a coding, and then converts the coding
symbol by symbol to ordered pairs: A becomes (A,A), ' becomes (',')
and # becomes (#,#). Then the LBA tries successive deletions of vertices,
writing the result in the second coordinate of these ordered pairs, and
replacing each deleted symbol by a null symbol N. After each deletion,
a new coding would be initialized and compared with each of the
finitely many initial codings for F. When a particular succession of
deletions fails to find a coding for F, the original coding is rewritten
into the second coordinates, and the process starts again. The LBA halts
in a final state only when a coding for F is found.

Corollary 3.17 The predicate "x is a TNP coding" is context-sensitive.

Proof: We use Kuratowski's theorem and the preceding result to
locate one of the two "forbidden" subcodings within the given coding. If
one or both is found, the coding is TNP.

The foregoing results amply suggest that context-sensitive
languages play the central role in characterizing those types of
geometric properties under discussion here. Even with the restricted
amount of "geometry" available to us, the full power of
context-sensitivity was necessary to handle all but the simplest
figure-classes, and we feel that there is every reason to believe that
similar investigations will reach much the same conclusions.

The disturbing aspect of this study is that linguistics, as it is
applied to the study of natural languages, centers upon the context-free
property; in fact, Ginsburg refers to languages which are not
context-free as "non-languages". Thus much linguistic analysis cannot be
extended to include geometry, even in the simple formulation presented
here. Among the implications of this fact we can mention the
non-existence of derivation trees for codings and the impossibility of
left-to-right generation of terminal symbols in codings, both of which
are used in the analysis of natural languages.

SECTION 4: INVARIANCE THEOREMS

Many of the results in the preceding section seem to depend
heavily upon the choice of coding procedure employed; it is natural
to ask whether some other method of coding figures would have given
more optimistic results. The theorems in this section will help to
generalize the results in Section 3 to a larger class of coding
procedures.

Definition 4.1 A finite automaton with output (FAO) is a quintuple
ℱ = <K, Σ, Δ, q_0, δ> in which

1) K is a finite, non-empty set (the set of states),

2) Σ and Δ are finite non-empty sets (the input and output
alphabets, respectively),

3) q_0 ∈ K (the start state), and

4) δ: K × Σ → K × Δ (the transition function).

An FAO operates like a finite automaton except that instead of
accepting or rejecting an input string, it generates a new string of
symbols, of length equal to the length of the input string. Thus we can
think of an FAO as a function from Σ* to Δ*. We will define a
non-deterministic FAO (NDFAO) analogously, except that we now allow
δ: K × Σ → P(K × Δ) − {∅}, the set of non-empty subsets of K × Δ. When
ℱ is in a state q and scanning a symbol v ∈ Σ, the range of possible
moves for ℱ is any (q',w) ∈ δ(q,v). If α is an input string, we will
let ℱ(α) denote the set of all output strings which can be generated
from α by the NDFAO ℱ. We shall think of deterministic FAO's as special
cases of NDFAO's, so that many theorems about the latter will apply
a fortiori to the former.

As an example of an FAO, consider ℱ_0 = <{q_0}, {A, ', #}, {0, 1},
q_0, δ>, where δ is defined:

δ(q_0, A) = (q_0, 0)
δ(q_0, ') = (q_0, 1)
δ(q_0, #) = (q_0, 0)

If the string A'A''#A''A'''#A'''A' is used as input for ℱ_0, the output
is 01011001101110011101. The effect is to transform a coding in the
sense used in Section 3 into one of the similar codings mentioned on
page 20.
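
The operation of a deterministic FAO is easy to simulate. In the Python sketch below, run_fao and DELTA are hypothetical names, and the transition table is the one-state machine whose behaviour is fixed by the sample input and output just quoted: A and # are sent to 0, and ' is sent to 1.

```python
def run_fao(delta, start, text):
    """Feed text to a deterministic FAO and return the output string."""
    state, out = start, []
    for symbol in text:
        state, emitted = delta[(state, symbol)]
        out.append(emitted)            # exactly one output symbol per input symbol
    return "".join(out)

DELTA = {
    ("q0", "A"): ("q0", "0"),
    ("q0", "'"): ("q0", "1"),
    ("q0", "#"): ("q0", "0"),
}
```

Calling run_fao(DELTA, "q0", "A'A''#A''A'''#A'''A'") returns 01011001101110011101, in agreement with the text; the output always has the same length as the input.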

Definition 4.2 Let ℒ and ℒ' be languages over the alphabets W and W'
respectively. Then ℒ' is a regular transform of ℒ, in case there is
a NDFAO ℱ such that ℒ' is the image of ℒ under ℱ, denoted ℒ' = ℱ(ℒ).

When ℒ is a specified language and ℱ is a NDFAO whose input
alphabet includes the alphabet of ℒ, we will call ℱ(ℒ) the regular
transform of ℒ under ℱ.

We have defined a regular transform with respect to the most general
type of FAO, namely, non-deterministic, and this much generality may not
always be necessary. Clearly the case which is most interesting is when
the FAO relating two languages is not merely deterministic, but one-to-one
as well. The reason for the definition as it stands is that it will be
useful to be able to define the inverse of an FAO, which would not in
general be an FAO unless the original FAO were supposed to be one-to-one.
Note that even when NDFAO's are employed, the image of a finite language
is finite and the image of an infinite language is infinite.

Remark: If ℒ' is a regular transform of ℒ, ℒ may not be a regular
transform of ℒ', even if the FAO is deterministic and one-to-one on ℒ.

For example, let ℒ = {a^n b^n | n > 0}; ℒ is context-free and not
regular. Let ℱ = <{q_0}, {a, b}, {1}, q_0, δ>, where δ is given by
δ(q_0,a) = (q_0,1) and δ(q_0,b) = (q_0,1). Then ℱ(ℒ) = ℒ' = {1^n | n is
an even positive integer}, and ℒ' is regular. By the following theorem,
ℒ cannot be a regular transform of ℒ'.

Theorem 4.3 If ℒ is regular and ℒ' is a regular transform of ℒ, then
ℒ' is regular.

Proof: Let G = <V_N, V_T, P, S> be a regular grammar for ℒ and let
the NDFAO be ℱ = <K, V_T, W, q_0, δ> where ℒ' ⊆ W* and ℱ(ℒ) = ℒ'.
Define a new grammar G' = <V_N × K, W, P', (S,q_0)> where the rules in P'
are specified as follows:

If A → vB is a rule in P, then for every q ∈ K, P' will contain
all the rules (A,q) → w(B,q') where (q',w) ∈ δ(q,v).

If A → v is a rule in P, then for every q ∈ K, P' will contain
the rules (A,q) → w where for some q' ∈ K, (q',w) ∈ δ(q,v).

Now we prove that ℒ' = L(G'). Let α ∈ ℒ' where α = w_1w_2 ... w_n. Since
α ∈ ℒ', there is a β ∈ ℒ, β = v_1v_2 ... v_n, such that α ∈ ℱ(β). G
generates β, so we have a sequence of rules in P: S → v_1S_1, S_1 → v_2S_2,
... S_{n-1} → v_n. Also since α ∈ ℱ(β), we have in K a sequence of states
q_0, q_1, ... q_n such that (q_0,v_1) → (q_1,w_1), (q_1,v_2) → (q_2,w_2),
... (q_{n-1},v_n) → (q_n,w_n), that is, (q_i,w_i) ∈ δ(q_{i-1},v_i) for
i = 1, 2, ... n. Hence in P' we have a sequence of rules
(S,q_0) → w_1(S_1,q_1), (S_1,q_1) → w_2(S_2,q_2), ...
(S_{n-1},q_{n-1}) → w_n, and so α ∈ L(G').

Conversely, let α ∈ L(G'), α = w_1w_2 ... w_n. Then there is in P' a
sequence of rules (S,q_0) → w_1(S_1,q_1), (S_1,q_1) → w_2(S_2,q_2), ...
(S_{n-1},q_{n-1}) → w_n. Hence we have for some sequence of v_i's in V_T
that (q_1,w_1) ∈ δ(q_0,v_1), (q_2,w_2) ∈ δ(q_1,v_2), ...,
(q_n,w_n) ∈ δ(q_{n-1},v_n). Hence there is in P a sequence of rules
S → v_1S_1, S_1 → v_2S_2, ... S_{n-1} → v_n. Thus β = v_1v_2 ... v_n ∈ ℒ,
and α ∈ ℱ(β). So α ∈ ℒ'.

Thus we have proved that ℒ' ⊆ L(G') and L(G') ⊆ ℒ', so the two
are equal.
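
The construction of G' in this proof can be carried out mechanically. The Python sketch below is illustrative only: the encoding of rules as triples, the names transform_regular_grammar and language, and the small example grammar and NDFAO are all our own assumptions. A rule A → vB is written (A, v, B), a rule A → v is written (A, v, None), and delta maps a pair (q, v) to a set of pairs (q', w).

```python
def transform_regular_grammar(rules, delta, states):
    """Build the rules P' of G' from the rules P of G and the NDFAO."""
    new_rules = set()
    for (a, v, b) in rules:
        for q in states:
            for (q2, w) in delta.get((q, v), ()):
                if b is None:
                    new_rules.add(((a, q), w, None))      # (A,q) -> w
                else:
                    new_rules.add(((a, q), w, (b, q2)))   # (A,q) -> w(B,q')
    return new_rules

def language(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    words, frontier = set(), {("", start)}
    for _ in range(max_len):
        nxt = set()
        for prefix, var in frontier:
            for (a, v, b) in rules:
                if a == var:
                    if b is None:
                        words.add(prefix + v)
                    else:
                        nxt.add((prefix + v, b))
        frontier = nxt
    return words

# Example: G generates (ab)^+; the NDFAO may switch from state q0 to q1 on an a.
G = [("S", "a", "B"), ("B", "b", "S"), ("B", "b", None)]
DELTA = {
    ("q0", "a"): {("q0", "x"), ("q1", "y")},
    ("q0", "b"): {("q0", "x")},
    ("q1", "a"): {("q1", "y")},
    ("q1", "b"): {("q1", "y")},
}
G_PRIME = transform_regular_grammar(G, DELTA, ["q0", "q1"])
```

In this example the strings of length at most 4 generated by G' from (S, q0) are xx, yy, xxxx, xxyy and yyyy, which are exactly the outputs the NDFAO can produce from ab and abab.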

Theorem 4.4 If ℒ is context-free and ℒ' is a regular transform of ℒ,
then ℒ' is context-free.

Proof: Let ℒ ⊆ V*, let G = <V_N, V, P, S> be a context-free
grammar for ℒ, and let ℱ = <K, V, W, q_0, δ> be a NDFAO mapping ℒ to
ℒ', where ℒ' ⊆ W*. We may assume that the rules in P are in Chomsky
normal form, that is, each rule is of the form A → BC or A → v, where
A, B, C ∈ V_N and v ∈ V. For each A ∈ V_N and q ∈ K define 𝒜(A,q) =
{q' ∈ K | A ⇒*_G α for some α ∈ V* and (q',w) ∈ δ*(q,α) for some w ∈ W*},
where by δ* we mean the natural extension of δ to strings [see p. 15].
Intuitively 𝒜(A,q) is the set of all states in K which are in a sense
"final" states when ℱ is applied to a string α with q as start state,
where α is some terminal string generated from the variable A. Thus
𝒜(S,q_0) would be the set of "final" states for strings in ℒ. Notice
that if A is a variable in V_N which occurs in at least one derivation
in G then A ⇒*_G α for some α ∈ V*, and hence 𝒜(A,q) is non-empty for
such variables.

Define a grammar G' = <V_N × K, W, P', (S,q_0)> where the rules in
P' are specified as follows:

1) If A → BC is in P then we will include in P' each rule of the
form (A,q) → (B,q)(C,q') where q is any element of K and
q' ∈ 𝒜(B,q).

2) If A → v is in P then we will include in P' all rules of the
form (A,q) → w where for some q' ∈ K, (q',w) ∈ δ(q,v).

Notice that G' is a context-free grammar.

Part I: Let β = w_1w_2 ... w_n ∈ ℒ'. There is an α = v_1v_2 ... v_n ∈ ℒ
with β ∈ ℱ(α), and hence in K we have elements q_0, q_1, ... q_n such that
(q_1,w_1) ∈ δ(q_0,v_1), (q_2,w_2) ∈ δ(q_1,v_2), ...
(q_n,w_n) ∈ δ(q_{n-1},v_n). Let D be a derivation of α in G, arranged so
that all rules of the form A → BC precede rules of the form A → v. Thus
S ⇒*_G A_1A_2 ... A_n for some sequence of variables in V_N, such that
A_i → v_i is in P for i = 1, 2, ... n. We will establish that
(S,q_0) ⇒* (A_1,q_0)(A_2,q_1) ... (A_n,q_{n-1}) in G'.

Certainly for some sequence r_0, r_1, ... r_{n-1} in K we have (S,q_0)
⇒*_G' (A_1,r_0)(A_2,r_1) ... (A_n,r_{n-1}), since each rule A → BC used in
the derivation of A_1A_2 ... A_n has counterparts in P' of the form
(A,q) → (B,q)(C,q') for all q ∈ K and for q' ∈ 𝒜(B,q), since 𝒜(B,q) is
non-empty. Thus the derivation S ⇒*_G A_1A_2 ... A_n can be duplicated
in G'. We will use induction to prove that it is possible for r_0 = q_0,
r_1 = q_1, ... r_{n-1} = q_{n-1}.

First note that r_0 = q_0 as a result of the construction of the rules:
the second component of the leftmost variable in any derivation in G' of
a non-terminal string from (S,q_0) must be q_0.

Now suppose the result to be true for i-1, that is, that (S,q_0) ⇒*
(A_1,q_0)(A_2,q_1) ... (A_i,q_{i-1})(A_{i+1},r_i) ... (A_n,r_{n-1}) in G'.
In the derivation tree of α, locate the least upper bound X of the
variables A_i and A_{i+1}. Then for variables Y and Z in V_N we have that
X → YZ is in P; that either Y = A_i or for some j (1 ≤ j < i)
Y ⇒* A_jA_{j+1} ... A_i; and that either Z = A_{i+1} or for some k
(i+1 < k ≤ n) Z ⇒* A_{i+1} ... A_k. In P' we have the rules
(X,q) → (Y,q)(Z,q') for each q ∈ K and q' ∈ 𝒜(Y,q), and in particular
(X,q_{j-1}) → (Y,q_{j-1})(Z,q') where q' ∈ 𝒜(Y,q_{j-1}). Furthermore
notice that since Y ⇒* v_jv_{j+1} ... v_i in G and
(q_j,w_j) ∈ δ(q_{j-1},v_j), ... (q_i,w_i) ∈ δ(q_{i-1},v_i), we have that
q_i ∈ 𝒜(Y,q_{j-1}), and therefore (X,q_{j-1}) → (Y,q_{j-1})(Z,q_i) is
in P'.

Notice now that by induction (X,q_{j-1}) must occur as a variable in
the derivation of (A_1,q_0)(A_2,q_1) ... (A_i,q_{i-1})(A_{i+1},r_i) ...
(A_n,r_{n-1}), and hence it is possible for r_i = q_i. Thus the induction
step is complete and (S,q_0) ⇒* (A_1,q_0)(A_2,q_1) ... (A_n,q_{n-1}).

It remains to show that (A_i,q_{i-1}) → w_i is in P' for i = 1, 2, ... n,
but this follows from the fact that A_i → v_i is in P and
(q_i,w_i) ∈ δ(q_{i-1},v_i). Thus β ∈ L(G').

Part II: Now let β = w_1w_2 ... w_n ∈ L(G'). In G' we have a derivation
(S,q_0) ⇒* (A_1,q_0)(A_2,q_1) ... (A_n,q_{n-1}) and rules
(A_i,q_{i-1}) → w_i for i = 1, 2, ... n. Each rule in P' of the form
(A,q) → (B,q)(C,q') corresponds to a unique rule in P, A → BC, so in G
we have S ⇒* A_1A_2 ... A_n. In addition the rules (A_i,q_{i-1}) → w_i
correspond to rules A_i → v_i in P, where each v_i is such that
(q_i,w_i) ∈ δ(q_{i-1},v_i). Thus we have S ⇒* α = v_1v_2 ... v_n in G.
So α ∈ ℒ, and β ∈ ℱ(α) since (q_1,w_1) ∈ δ(q_0,v_1),
(q_2,w_2) ∈ δ(q_1,v_2), ... (q_n,w_n) ∈ δ(q_{n-1},v_n). Thus β ∈ ℒ'.

Parts I and II show that L(G*) both contains and Is contained In 
Jt', hence L(G') « Jt' and iC Is context-free. 
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T heorem U.5 Let JC be context-sensitive and be a ref^ular transform 
of ol^. Then «C! Is context-sensitive. 

Proof: Let $\mathcal{M} = \langle K, V, W, q_0, \delta \rangle$ be an NDFAO
which maps $\mathcal{L}$ onto $\mathcal{L}'$, where $\mathcal{L} \subseteq V^*$
and $\mathcal{L}' \subseteq W^*$. We define a special three-track LBA $T$
to recognise words in $\mathcal{L}'$. $T$ will accept a word
$\beta = w_1 w_2 \cdots w_n \in W^*$ as input and change each symbol $w_i$ of
$\beta$ to a triple $(w_i, 0, 0)$. If $\beta \in \mathcal{L}'$ then there is an
$\alpha = v_1 v_2 \cdots v_n \in \mathcal{L}$ such that
$\beta \in \mathcal{M}(\alpha)$. $T$ will find $\alpha$ as follows: first $T$
generates nondeterministically a string $\alpha \in V^*$ with $|\alpha| = n$,
and stores $\alpha$ symbol by symbol in the blank second coordinates of
the input. Then $T$ applies $\delta$ to $\alpha$ nondeterministically,
generating elements of $\mathcal{M}(\alpha)$ and storing each in turn in the
third coordinate space of the input. When an element of $\mathcal{M}(\alpha)$
is found which equals $\beta$, $T$ will then test $\alpha$ to determine whether
$\alpha \in \mathcal{L}$, using the fact that $\mathcal{L}$ is
context-sensitive. If not, $T$ will discard $\alpha$ and start over with
another $\alpha' \in V^*$ with $|\alpha'| = n$. $T$ will halt in a final state
when an element $\alpha$ of $\mathcal{L}$ is found with
$\beta \in \mathcal{M}(\alpha)$. At each point in the test to determine
whether a particular $\alpha \in V^*$ belongs to $\mathcal{L}$, $T$ can
nondeterministically exit and begin computation over again with a
(possibly) different value for $\alpha$. This will prevent $T$ from becoming
bogged down in computations for an incorrect value of $\alpha$.
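
The search carried out by $T$ can be imitated deterministically by simply trying every candidate pre-image of the right length. The following Python fragment is only a sketch under stated assumptions, not the LBA itself: the NDFAO is represented as a dictionary from (state, input symbol) to a set of (next state, output symbol) pairs, the membership test for the source language is passed in as a function, and the names `transform_outputs`, `in_regular_transform`, `DELTA` and `in_anbncn` are illustrative and do not occur in the text.

```python
from itertools import product


def transform_outputs(alpha, delta, q0):
    """Return every output string the NDFAO can produce on input alpha.

    delta maps (state, input_symbol) to a set of (next_state, output_symbol).
    """
    configs = {(q0, "")}  # reachable (state, output so far) pairs
    for v in alpha:
        configs = {(q_next, out + w)
                   for (q, out) in configs
                   for (q_next, w) in delta.get((q, v), ())}
    return {out for (_, out) in configs}


def in_regular_transform(beta, V, delta, q0, in_L):
    """Decide whether beta lies in the regular transform of L.

    Every alpha over V with |alpha| = |beta| is tried in turn, mirroring
    the guesses made by T; in_L is an assumed membership test for L.
    """
    for letters in product(V, repeat=len(beta)):
        alpha = "".join(letters)
        if beta in transform_outputs(alpha, delta, q0) and in_L(alpha):
            return True
    return False


# Illustrative data (an assumption, not from the text): L = {a^n b^n c^n},
# and a one-state NDFAO sending a -> x, b -> x or y, c -> z.
def in_anbncn(s):
    n, r = divmod(len(s), 3)
    return r == 0 and n > 0 and s == "a" * n + "b" * n + "c" * n


DELTA = {
    ("q0", "a"): {("q0", "x")},
    ("q0", "b"): {("q0", "x"), ("q0", "y")},
    ("q0", "c"): {("q0", "z")},
}
```

The sketch takes exponential time, whereas the point of the proof is that the same search needs only linear space: the three tracks hold the input, the current guess, and the current output.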

Corollary 4.6 If $\mathcal{L}$ is Type 0 and $\mathcal{L}'$ is a regular
transform of $\mathcal{L}$, then $\mathcal{L}'$ is also Type 0.

Proof: The technique employed in Theorem 4.5 generalizes to Turing
machines, which may need more computing space to determine whether
$\alpha \in \mathcal{L}$, but otherwise work the same.
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One major limitation of regular transforms would seem to be the 
fact that the length of the output string must equal that of the input 
string. There are tricks, however, which can help overcome this problem
in some special cases. Suppose our coding procedure had been this: the
$n$th vertex is coded by $1^n$; 0 is a vertex-delimitation symbol; # is a
line-delimitation symbol. The triangle

*    A'A'''#A''''A'#A'''A''''

would be coded

**   10111#111101#11101111.

The number of symbols in * is 24, the number in ** is 21. If we modify
the second coding procedure slightly, we can make the number of symbols
in a "new" coding correspond to the number of symbols in the original.
Suppose we consider the "new" coding to be over the alphabet {0, 1, #, S},
where S is a new symbol, and the procedure is modified so that each
coding starts with an S and the lines are now delimited with ## instead
of # as before:

***  S10111##111101##11101111.

It is easily seen that this procedure results in codings which have
the same length as codings in *, relative to the choice of vertex-labels.
If the change is not objectionable, we now have a coding procedure which
is a regular transform of our original one. The FAO which accomplishes
this is given by:

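$\langle \{q_0, q_1, q_2\}, \{A, ', \#\}, \{S, 0, 1, \#\}, q_0, \delta \rangle$ where
$\delta$ is defined by

    $\delta(A, q_0) = (S, q_1)$        $\delta(\#, q_1) = (\#, q_2)$

    $\delta(', q_1) = (1, q_1)$        $\delta(A, q_2) = (\#, q_1)$

    $\delta(A, q_1) = (0, q_1)$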
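
As a check on the length-preserving coding, the FAO can be simulated directly. The fragment below is a minimal sketch: the table is our reading of the transition function (keys are (input symbol, state) as in the text, values are (output symbol, next state)), and the names `FAO_DELTA` and `run_fao` are illustrative, not part of the original.

```python
# Transition table of the FAO as we read it from the text.
# Key: (input symbol, state).  Value: (output symbol, next state).
FAO_DELTA = {
    ("A", "q0"): ("S", "q1"),  # first vertex label of the coding
    ("'", "q1"): ("1", "q1"),  # each prime becomes a 1
    ("A", "q1"): ("0", "q1"),  # a later vertex label on the same line
    ("#", "q1"): ("#", "q2"),  # end of a line
    ("A", "q2"): ("#", "q1"),  # first vertex label of the next line
}


def run_fao(word, delta=FAO_DELTA, start="q0"):
    """Emit exactly one output symbol per input symbol."""
    state, out = start, []
    for symbol in word:
        if (symbol, state) not in delta:
            raise ValueError(f"no transition for {symbol!r} in state {state}")
        output, state = delta[(symbol, state)]
        out.append(output)
    return "".join(out)


original = "A'A'''#A''''A'#A'''A''''"  # the coding marked * above
recoded = run_fao(original)            # the coding marked *** above
```

Because one symbol is written for each symbol read, `recoded` has the same length as `original`, which is exactly the property a regular transform requires.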
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We can use similar tricks to ''shrink'' or "stretch^' the output 
string, but a better way is to generalize our definition of regular
transform to include the case where an NDFAO can output a finite number
of symbols, and possibly the empty symbol, at each step, instead of
exactly one symbol. The appropriate definitions are these:

Definition 4.7 A generalized NDFAO (GFAO)† is a quintuple
$\mathcal{M} = \langle K, \Sigma, \Delta, q_0, \delta \rangle$ where

1) $K$, $\Sigma$, and $\Delta$ are finite non-empty sets,

2) $q_0 \in K$,

3) $\delta$ is a function from $K \times \Sigma$ to non-empty finite subsets of
$K \times \Delta^*$.

Definition 4.8 If $\mathcal{L} \subseteq V^*$ and
$\mathcal{L}' = \mathcal{M}(\mathcal{L})$ for some GFAO $\mathcal{M}$, then
$\mathcal{L}'$ is called a generalized regular transform (GRT) of $\mathcal{L}$.

Theorem 4.9 The GRT of a regular (context-free) language is regular
(context-free).

Proof: Let $\mathcal{L}$ be regular and let $G$ be a regular grammar for
$\mathcal{L}$. We can find a grammar $G'$ for $\mathcal{L}'$ as we did in the
proof of Theorem 4.3: the rules in $P'$ will now be of the form
$A \to \alpha B$ and $A \to \alpha$, where $A$ and $B$ are variables of $G'$ and
$\alpha \in W^*$, possibly $\alpha = \epsilon$. To show that $\mathcal{L}'$ is
regular we will replace the rules in $P'$ with new equivalent ones.

†These definitions and the following theorems were discovered to be
minor variations of what is known in the literature as a generalized
sequential machine, and similar results for so-called "gsm mappings" can
be found in Hopcroft and Ullman [pp. 128-130]. However, the results
given here were developed independently as extensions of the notion of
a regular transform and the proofs in this paper are in no way related
to the proofs in Hopcroft and Ullman. We include these results for
completeness.

First, if $A \to \alpha B$ is in $P'$ for some $\alpha = v_1 v_2 \cdots v_n$
where $n > 1$, we delete this rule and add new ones $A \to v_1 X_1$,
$X_1 \to v_2 X_2$, ..., $X_{n-1} \to v_n B$, where $X_1, X_2, \ldots, X_{n-1}$
are new variables added to $G'$. Thus we can delete all such rules
$A \to \alpha B$ and similarly all rules $A \to \alpha$, where $|\alpha| > 1$.
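
This first step is purely mechanical and can be sketched in a few lines. In the fragment below a rule $A \to \alpha B$ is represented, by assumption, as a triple `(A, alpha, B)` with `B = None` standing for a rule $A \to \alpha$; the function name `split_long_rules` and the fresh variable names `X1`, `X2`, ... are illustrative and are assumed not to clash with variables already in the grammar.

```python
from itertools import count


def split_long_rules(rules):
    """Replace each rule whose terminal string has length > 1 by a chain
    of rules that each emit exactly one terminal symbol.

    rules: list of (head, alpha, tail); alpha is a string over W and
    tail is a variable name, or None for a rule A -> alpha.
    """
    fresh = count(1)  # supplies indices for the new variables X1, X2, ...
    result = []
    for head, alpha, tail in rules:
        if len(alpha) <= 1:
            # Already of the form A -> vB, A -> v, A -> eB or A -> e.
            result.append((head, alpha, tail))
            continue
        current = head
        for symbol in alpha[:-1]:
            new_var = f"X{next(fresh)}"
            result.append((current, symbol, new_var))
            current = new_var
        # The last link of the chain hands over to the original tail.
        result.append((current, alpha[-1], tail))
    return result
```

For instance, the single rule `("A", "xyz", "B")` becomes the chain `A -> xX1`, `X1 -> yX2`, `X2 -> zB`, while rules with an empty or one-symbol string are passed through for the later steps to handle.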

Second, any rule of the form $A \to \epsilon A$ can be dropped without
replacement, even if the rule is $S \to \epsilon S$, since such rules have no
effect on a derivation. If $A \ne S$ and the rule $A \to \epsilon B$ occurs in
$P'$ for some $B \ne A$, it too can be deleted, provided every occurrence
of $A$ in the remaining rules is replaced by $B$. It should be clear that
if $S \Rightarrow^* w_1 w_2 \cdots w_n$ in $G'$ before these changes and $A$
occurs in this derivation, then all instances of $A$'s in the derivation
are replaced by $B$'s and the derivation still holds. Cycles of such rules
$A \to \epsilon B$, $B \to \epsilon C$, $C \to \epsilon A$, and so on will
cause no difficulties. If rules of the form $S \to \epsilon B$ occur, they
can be deleted similarly, with $B$ becoming the new start symbol.

All the remaining rules are of the form $A \to vB$, $A \to v$, or
$A \to \epsilon$. Any rule $A \to \epsilon$, where $A$ is a variable which never
appears on the right-hand side of some other rule, can be dropped, since
such a rule can never occur in a derivation; the rule $S \to \epsilon$ can be
allowed to remain. If $A \to \epsilon$ is a rule in $P'$ and $A$ does appear on
the right-hand side of some other rule $B \to vA$, we can delete
$A \to \epsilon$ from $P'$ and add $B \to v$.

These changes in production rules clearly give a regular grammar,
and one which is equivalent to $G'$. Thus $\mathcal{L}'$ is regular.

The proof for context-free languages is similar. The rules we will
obtain using the technique in Theorem 4.4 will have the form $A \to BC$ or
$A \to \alpha$, which are context-free rules, with the exception of the rules
$A \to \epsilon$. Hopcroft and Ullman prove [pp. 62-63] that these so-called
$\epsilon$-rules can be added to context-free grammars without destroying the
context-free property.

Theorem 4.10 Let $\mathcal{L}$ be context-sensitive, let $\mathcal{M}$ be a
GFAO, and let $\mathcal{L}' = \mathcal{M}(\mathcal{L})$. Then $\mathcal{L}'$
will be context-sensitive, provided the following condition is met: there
is an integer $n > 0$ such that whenever $\alpha \in \mathcal{L}'$, there is a
$\beta \in \mathcal{L}$ such that $\alpha \in \mathcal{M}(\beta)$ and
$|\beta| \le n|\alpha|$.

Proof: The condition amounts to a linear bound on possible pre-images
for words in $\mathcal{L}'$. An LBA which recognises words in $\mathcal{L}'$
would operate in a manner similar to that described in the proof of
Theorem 4.5, except that all words must be checked having length up to
$n$ times the length of the input.

Corollary 4.11 The GRT of a Type 0 language is Type 0.

Proof: The method employed in Corollary 4.6 generalizes to GRT's.

These theorems indicate that the tricks used on p. 46 are not
necessary to deal with various coding procedures which are similar to
that used in Section 3. For example, to transform

    A'A'''#A''''A'#A'''A''''

into

    10111#111101#11101111

we can use a GFAO whose transition function is defined by

    $\delta(q_0, A) = (q_1, \epsilon)$
    $\delta(q_1, ') = (q_1, 1)$
    $\delta(q_1, A) = (q_1, 0)$
    $\delta(q_1, \#) = (q_0, \#)$

Notice that the linear bound requirement imposed by Theorem 4.10 is
met, with $n = 2$. A GFAO which "reverses" the above example would have
transition function $\delta^*$ defined by

    $\delta^*(q_0, 1) = (q_1, A')$
    $\delta^*(q_1, 1) = (q_1, ')$
    $\delta^*(q_1, 0) = (q_1, A)$
    $\delta^*(q_1, \#) = (q_0, \#)$.
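
Both machines are easy to run by hand, and the following sketch does the same thing mechanically. It assumes the two transition functions read as tabulated below, with keys (state, input symbol) and values (next state, output string); the entry for state `q0` in `REVERSE` is our reading of the start-of-line transition, and the names `FORWARD`, `REVERSE` and `run_gfao` are illustrative. Both tables happen to be deterministic, although Definition 4.7 allows a set of choices at each step.

```python
# Key: (state, input symbol).  Value: (next state, output string).
FORWARD = {
    ("q0", "A"): ("q1", ""),   # first label of a line: output the empty word
    ("q1", "'"): ("q1", "1"),
    ("q1", "A"): ("q1", "0"),
    ("q1", "#"): ("q0", "#"),
}

REVERSE = {
    ("q0", "1"): ("q1", "A'"),  # start of a line (our reading of the table)
    ("q1", "1"): ("q1", "'"),
    ("q1", "0"): ("q1", "A"),
    ("q1", "#"): ("q0", "#"),
}


def run_gfao(word, delta, start="q0"):
    """Run a deterministic GFAO; each step may emit zero or more symbols."""
    state, out = start, []
    for symbol in word:
        if (state, symbol) not in delta:
            raise ValueError(f"no transition for {symbol!r} in state {state}")
        state, piece = delta[(state, symbol)]
        out.append(piece)
    return "".join(out)


primes = "A'A'''#A''''A'#A'''A''''"
binary = run_gfao(primes, FORWARD)
back = run_gfao(binary, REVERSE)
```

On this example the pre-image has 24 symbols and the image 21, comfortably within the bound $|\beta| \le 2|\alpha|$ of Theorem 4.10.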
For the results in Section 3 to apply fully to a new coding method
we must have a GRT of the original language into the new one and
conversely. We have already seen that the GRT of a non-regular language
is not necessarily non-regular. However, the existence of the two GFAO's
mentioned above is sufficient to prove the following:

Theorem 4.12 All theorems proved in Section 3 remain true when our
original coding procedure is replaced with the procedure described in
** on p. 46.
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SECTION 5: FURTHER INVESTIGATIONS 

The most important remaining problem is that mentioned at the end
of Section 1, namely the identification or characterization of the class
of geometrically non-planar figures. Of course a simple linguistic
characterization for this class would be ideal, but this is perhaps more
than we can reasonably expect. As one sees in Section 3, it becomes
more apparent that, as Type 1 and even Type 0 languages begin to play
the central role, there is a shift from linguistics to a more mechanical
or computational aspect. In this sense, then, a linguistic treatment may
not be entirely satisfactory. The following theorems are an illustration.

Theorem 5.1 The set of planar codings is recursively enumerable, that
is, the language of planar codings is Type 0.

Proof: We will describe a procedure for checking codings which will
tell us when a coding is planar. If we are given a coding $c$ with $n$
distinct vertices, we look at the set of points in the Cartesian plane
with coordinates $(x, y)$ where $x = 0, 1, 2, \ldots, n$ and
$y = 0, 1, 2, \ldots, n$. These points form an $n \times n$ grid. We now
assign coordinates from this set to the vertices in $c$, trying to find a
figure which is a realization of $c$. If a realization for $c$ is found,
then $c$ is planar. If all possible ways of assigning coordinates fail to
produce a realization for $c$, we try an $(n+1) \times (n+1)$ grid. By
continuing in this manner, we will find a realization for $c$ if one
exists. If $c$ is non-planar, this procedure never terminates.

[Figure: A realization of c = {ABC, ADE, EF, CGD, BGF} on a 6 x 6 grid.]

To see that this procedure does terminate when $c$ is planar, we
note that if $c$ is planar, then certainly some realization for $c$ can be
found in the Cartesian plane in which the vertices have rational
coordinates. This follows partly from the fact that when a line
segment has rational endpoints, then any interior point which has one
rational coordinate has both coordinates rational. Further, we can
"shrink" this realization so that it lies within the unit square
bounded by (0,0), (0,1), (1,0), (1,1), and such that its vertices still
have rational coordinates. Let $n$ be the least common denominator for
all these rational coordinates: that is, each coordinate can be
expressed as $x/n$, where $0 \le x \le n$. Thus each vertex $v_i$ has
coordinates $(x_i/n, y_i/n)$, and hence we can consider $c$ to have a
realization on an $n \times n$ grid, where $v_i$'s coordinates are
$(x_i, y_i)$.
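
The passage from rational coordinates in the unit square to integer grid coordinates is a small computation, sketched below. The function name `to_grid` and the sample vertices are illustrative assumptions, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm  # accepts several arguments from Python 3.9 on


def to_grid(vertices):
    """Scale rational unit-square coordinates up to an integer grid.

    vertices: dict mapping a vertex name to a pair of Fractions in [0, 1].
    Returns (n, grid), where n is the least common denominator of all the
    coordinates and grid maps each vertex to integers (x, y), 0 <= x, y <= n.
    """
    denominators = [coord.denominator
                    for point in vertices.values()
                    for coord in point]
    n = lcm(*denominators)
    grid = {name: (int(x * n), int(y * n))
            for name, (x, y) in vertices.items()}
    return n, grid


# Illustrative vertices, not taken from the text.
sample = {
    "A": (Fraction(0), Fraction(0)),
    "B": (Fraction(1, 2), Fraction(1, 3)),
    "C": (Fraction(1), Fraction(3, 4)),
}
n, grid = to_grid(sample)
```

Multiplying every coordinate by the same $n$ is a uniform scaling, so collinearity and betweenness among the vertices are unchanged and the scaled figure is still a realization.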

For each coding $c$ there is an integer $m_c$ which has the property
that if no realization for the coding can be found using an
$m_c \times m_c$ grid, then no realization exists. The problem lies in
determining what this integer is for a general coding. If, for example,
it could be shown that there is a linear function $f$ with the property
that $f(|c|) > (m_c)^2$, then the language of planar codings would be
Type 1.

Theorem 5.2 The class of GNP codings is recursive.

Proof: In 1930 Alfred Tarski gave a decision procedure which applies
to sentences in the elementary theory of Euclidean geometry, using the
notions of equality, betweenness and equidistance. The geometric system
used in this paper is a subsystem of Euclidean geometry, so Tarski's
algorithm will apply.† To use the algorithm, we take a coding $c$ and
write a sentence which expresses all equality and betweenness relations
holding for vertices in $c$, including statements about line segments in
$c$ which do not intersect one another. The algorithm will transform this
into a sentence of elementary algebra involving real polynomials. The
truth of this sentence is decided using powerful techniques from real
analysis. As an example of the sentence we derive from a coding to
apply the algorithm, consider the GNP coding {ABC, AED, BD, CE} given on
p. 11. The description of this figure (coding) would be:

$(\exists A)(\exists B)(\exists C)(\exists D)(\exists E)\,[\,(A \ne B) \,\&\, (A \ne C) \,\&\, (A \ne D) \,\&\, (A \ne E) \,\&\, (B \ne C) \,\&\, (B \ne D)$
$\&\, (B \ne E) \,\&\, (C \ne D) \,\&\, (C \ne E) \,\&\, (D \ne E) \,\&\, b(A,B,C) \,\&\, b(A,E,D) \,\&\, (\forall X)[\,b(B,X,D)$
$\Rightarrow \text{not-}b(C,X,E)\,]\,]$.

A most promising area for further study is graph theory. Little
has been done in graph theory with straight-line graphs, but interest is
picking up, especially in connection with the four-color problem. In
1948 the graph theorist Fáry proved that any planar graph‡ can be
realized in the plane using straight lines. This means, for example, that
the coding mentioned above is realizable using straight lines, provided
the vertices B and E are no longer considered to be interior vertices
but endpoints. Thus the coding {AB, BC, AE, ED, BD, CE} is planar. It
is worth outlining the proof of this theorem.

The proof uses induction on the number of vertices in the graph.
Suppose the result is true for graphs having $n$ or fewer vertices and
assume $G$ to be a maximal planar graph with $n+1$ vertices. This means that
no unconnected pair of vertices can be connected without making $G$
non-planar. Since any planar graph with $n+1$ vertices is contained within
such a maximal graph, it will suffice to find a straight-line
representation for $G$. An interior vertex $v$ is selected, and the vertices
$v_1, v_2, \ldots, v_m$ which are adjacent to $v$ are located and listed in
cyclic order. We can assume $m > 3$. It follows from maximality that there
is a minimal path enclosing $v$ having edges $v_1 v_2$, $v_2 v_3$, ...,
$v_m v_1$. We remove vertex $v$ together with the edges connecting it to the
adjacent vertices, and add edges $v_1 v_3$, $v_1 v_4$, ..., $v_1 v_{m-1}$ if
not already present. The resulting graph is maximal planar and has $n$
vertices, so we can find a straight-line representation for it. The
construction of this graph makes it possible to reinsert the vertex $v$
inside the path $v_1 v_2 \cdots v_m$ in such a way that $v$ can be connected
to each $v_i$ using straight lines.§ Here is an example showing an
application of the theorem:

[Figure: an example application of the theorem, in which a vertex is
reinserted to obtain a straight-line representation for A.]

†We are indebted to Professor Haim Gaifman for pointing out that
Tarski's algorithm applies in this situation.

‡By a "graph" we will mean a coding in which every line segment or
"edge" has length 2, and such edges need no longer be straight lines.

§For additional details, see Ore, The Four Color Problem [pp. 5-8].

The reason for devoting so much space to this theorem and its proof
is that we can apply it to a particular grammar which generates a class
of planar straight-line graphs. We give a collection of production
rules which allow us to proceed from a given coding to a larger one;
these codings will be expressed here as sets of line segments as in
Section 1. Our starting point is $S = \{AB\}$. The rules are:

1) If $XY \in T$ or $YX \in T$ and $Z$ is a vertex not appearing in $T$, then
from $T$ we derive $T \cup \{YZ\}$ and also from $T$ we derive
$(T - \{XY\}) \cup \{XZ, ZY\}$.

2) If $XY \in T$ and $YZ \in T$ then from $T$ we can derive $T \cup \{XZ\}$,
provided this latter coding contains neither Kuratowski
subgraph.
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One suggestive way of visualizing these rules is the following: 
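our start symbol is a single line segment and the production rules
correspond to these:

[Figure: the production rules drawn as pictures.]

The figure A on p. 54 might be "derived" as follows:

[Figure: the successive figures in a derivation of A.]

and a corresponding sequence of codings would be:

$S = \{AB\} \Rightarrow \{AB, BC\} \Rightarrow \{AB, BC, CA\} \Rightarrow \{AB, BC, CA, BD\} \Rightarrow \{AB, BC,$
$CA, BD, DA\} \Rightarrow \{AB, BC, CA, BD, DA, DE\} \Rightarrow \{AB, BC, CA, BD, DA,$
$DE, EA\} \Rightarrow \{AB, BC, CA, BD, DE, EA, DF, FA\}$.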
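
The derivation above can be replayed step by step to confirm that each coding follows from its predecessor by one of the two rules. The sketch below makes several assumptions that are not in the text: edges are treated as unordered pairs, the function names are illustrative, and the Kuratowski condition of rule 2 is checked only by a simple sufficient test (any subdivision of $K_5$ has at least 10 edges and any subdivision of $K_{3,3}$ at least 9, so a coding with fewer than 9 edges contains neither).

```python
def edge(x, y):
    """An undirected edge, so that XY and YX denote the same segment."""
    return frozenset((x, y))


def vertices(T):
    return set().union(*T)


def rule1_extend(T, x, y, z):
    """Rule 1, first form: from XY in T and a new vertex Z derive T + {YZ}."""
    assert edge(x, y) in T and z not in vertices(T)
    return T | {edge(y, z)}


def rule1_subdivide(T, x, y, z):
    """Rule 1, second form: replace XY by XZ and ZY for a new vertex Z."""
    assert edge(x, y) in T and z not in vertices(T)
    return (T - {edge(x, y)}) | {edge(x, z), edge(z, y)}


def rule2_close(T, x, y, z):
    """Rule 2: from XY and YZ in T derive T + {XZ}."""
    assert x != z and edge(x, y) in T and edge(y, z) in T
    new = T | {edge(x, z)}
    if len(new) >= 9:
        # A real Kuratowski-subgraph test would be needed from here on.
        raise NotImplementedError("Kuratowski check omitted in this sketch")
    return new


T = frozenset({edge("A", "B")})        # S = {AB}
T = rule1_extend(T, "A", "B", "C")     # add BC
T = rule2_close(T, "C", "B", "A")      # add CA
T = rule1_extend(T, "A", "B", "D")     # add BD
T = rule2_close(T, "D", "B", "A")      # add DA
T = rule1_extend(T, "B", "D", "E")     # add DE
T = rule2_close(T, "E", "D", "A")      # add EA
T = rule1_subdivide(T, "D", "A", "F")  # replace DA by DF, FA
final = T
```

The last step uses the second form of rule 1, which explains why DA is absent from the final coding.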

The first thing we notice about these rules is that they seem to
bear no relation to the rules of a formal grammar, which cannot be used
for set generation. This is compensated for in their intuitive clarity,
for it is true that the language which they generate is context-sensitive,
and one can only imagine how intuitive the rules for a formal grammar
for the same language would be. As for the class of graphs, or rather
the class of codings for graphs, which is generated, we can apply Fáry's
theorem to show that any maximal planar graph can be so derived. Note
that the production rules given are specially suited to the generation
of triangles, and that any maximal planar graph is "triangulated," that
is, each face is bounded by a path of length 3 (if not, we could draw a
diagonal which bisects the face, contradicting maximality). However, it
is not possible to generate all planar graphs with these rules; here is
an example:

[Figure: a planar graph that cannot be generated by these rules.]

The last rule used in generating this graph cannot be rule 1, for this
results in a vertex of order one or two (that is, with one or two adjacent
vertices), and cannot be rule 2, for this rule completes a triangle, and
the graph contains no triangles.

The preceding example indicates the direction in which notions of
grammar and formal language must be extended for application to geometry
and similar areas. Work is currently being done on "indexed" languages,
in which the introduction of new vertex-labels in a coding can be
handled more naturally, and on "graph" grammars, which are quite similar
to the above example. Applications for such grammars are being
increasingly found in the areas of picture processing and pattern
recognition.
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